Posts tagged with “programming”
One-liner to extract a list of link addresses from an HTML file
I'm moving my research group's website to a new server and making some updates at the same time. One of the main things I need to do is make sure links are going to work after the transition. Here is a little one-line shell "script" (if you can call it that) that will extract link addresses from an HTML web page:
wget -q -O - http://www.google.com | tr " " "\n" | grep "href" | cut -f2 -d"\""
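wget fetches the page (-q silences its progress output, and -O - writes the content to stdout). tr replaces every space with a newline so that each attribute lands on its own line, grep keeps only the lines that contain "href", and finally cut prints the second field delimited by double quotes, which is everything between the first pair of double quotes on each line.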
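To make the three stages concrete, here is a rough Python 3 sketch that mimics what tr, grep and cut each do to the text. The function name and the sample HTML are made up for illustration; this is not a replacement for the one-liner, only a way to see what each stage produces.

```python
def extract_hrefs_naive(html):
    """Mimic: tr " " "\\n" | grep "href" | cut -f2 -d'"' (illustrative only)."""
    # tr " " "\n": every space becomes a line break
    lines = html.replace(" ", "\n").split("\n")
    # grep "href": keep only the lines that mention href
    lines = [line for line in lines if "href" in line]
    results = []
    for line in lines:
        fields = line.split('"')
        # cut -f2 -d'"': take the second quote-delimited field; like cut,
        # pass the line through unchanged if it has no delimiter at all
        results.append(fields[1] if len(fields) > 1 else line)
    return results


# Hypothetical sample input, not taken from any real page
sample = '<p><a href="http://example.com/a">A</a> <a class="x" href="/b">B</a></p>'
print(extract_hrefs_naive(sample))
```

Feeding it a single-quoted link such as <a href='/c'>C</a> returns the whole fragment href='/c'>C</a> rather than /c, which is the same way the shell pipeline behaves when there is no double quote to cut on.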
If you want to use a file you have on your local machine, you can use this variant instead:
tr " " "\n" < [file_name.html] | grep "href" | cut -f2 -d"\""
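Obligatory disclaimer: HTML is NOT a regular language and in general cannot be parsed with regular expressions or line-oriented text tools, as is done here. This is not guaranteed to work. For example, it will miss links whose attribute values are single-quoted, and it will truncate any URL that contains a space.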
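If the quick hack falls over on a particular page, an actual HTML parser is the safer route. Below is a minimal sketch using html.parser from the Python 3 standard library. The class name LinkCollector, the helper extract_links and the sample markup are my own placeholders, and it only looks at <a> tags, so treat it as a starting point rather than a complete link checker.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper, for illustration)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when the
        # attribute has no value at all
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return the href of every <a> tag found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample: single quotes, attribute order, and the word "href" in text
sample = """<p>The href attribute holds the target.</p>
<a href='/single'>one</a>
<a class="nav" href="/double">two</a>"""
print(extract_links(sample))
```

Unlike the pipeline, this ignores the word "href" when it appears in ordinary text and does not care which kind of quotes the page uses.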
python-mysqldb: execute() first
While working on implementing a schema-free, MySQL-backed data store (thanks, Bret Taylor!), I ran into a problem with using MySQLdb to access the database. I'll eventually post the code I wrote up on this site so others can see my example, but for now the following will suffice. When performing a SELECT, I would get the following error upon attempting to fetch my results.
Incorrect Python:
q = "SELECT body FROM entities WHERE id='%s'" % (entity_id)
self.conn.cursor().execute(q)
entity = self.conn.cursor().fetchone()
Error:
File "datastore.py", line 93, in get
entity = self.conn.cursor().fetchone()
File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 340, in fetchone
self._check_executed()
File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 70, in _check_executed
self.errorhandler(self, ProgrammingError, "execute() first")
File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: execute() first
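The issue was using two separate cursor objects. Each call to self.conn.cursor() returns a brand-new cursor, so the query was executed on one cursor and fetchone() was called on a second cursor that had never executed anything. Here's the corrected code: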
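# Create one cursor and use it for both execute() and fetchone()
c = self.conn.cursor()
# Let MySQLdb escape entity_id instead of formatting it into the SQL string
# (not related to the error above, but it avoids SQL injection)
q = "SELECT body FROM entities WHERE id=%s"
c.execute(q, (entity_id,))
entity = c.fetchone()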
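If you want to poke at the two-cursor mistake without a MySQL server handy, here is a small sketch using sqlite3 from the Python 3 standard library as a stand-in. That swap is an assumption on my part and the details differ: sqlite3 uses ? placeholders rather than %s, and a never-executed sqlite3 cursor quietly returns None from fetchone() instead of raising ProgrammingError. The function names and the sample row are made up; only the entities table with its id and body columns comes from the snippet above.

```python
import sqlite3

# In-memory database standing in for the MySQL-backed store (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO entities VALUES (?, ?)", ("abc", "hello"))

QUERY = "SELECT body FROM entities WHERE id = ?"


def get_body_broken(conn, entity_id):
    # First cursor runs the query and is then thrown away
    conn.cursor().execute(QUERY, (entity_id,))
    # Second cursor has executed nothing, so there is no result set to read
    return conn.cursor().fetchone()


def get_body(conn, entity_id):
    # One cursor for both steps, so fetchone() sees the query's results
    c = conn.cursor()
    c.execute(QUERY, (entity_id,))
    return c.fetchone()


print(get_body_broken(conn, "abc"))
print(get_body(conn, "abc"))
```

The broken version comes back empty-handed even though the row exists, while the single-cursor version returns it. Same cause as the MySQLdb traceback, just a quieter symptom.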
Kohana Documentation Offline Snapshot
This is a snapshot of the Kohana documentation as of Aug 22, 2009 that can be used offline. It's not pretty but it is nice if, say, you're about to get on a plane and want to be able to reference the Kohana documentation.
Processing Source Attachments for Eclipse
This file contains the source for processing.core from Processing 1.0.5, as of today the most recent release of the Processing library. You can specify this zip file as the source attachment in Eclipse for core.jar (right click on core.jar > Properties… > Java Source Attachment > External File) to view the source code for the processing.core package.