Showing posts with label python. Show all posts

20121128

My little stupid Raspberry Pi project: 1 LED and 1 Switch


I recently bought a Raspberry Pi. My goal is to build a living room mp3 player. As I’m a software guy, the programming part isn’t a problem, but the mp3 player has to be able to respond to some buttons with a semantic like: “Hey, next song please!”. So I have to do some simple hardware. Sadly, my knowledge of electronics has faded. This is the journey of an electronics ignorant in hardware land.

As a first step, I put together two examples from the raspbian user guide. Pressing a switch toggles a LED, enabling and disabling it. Right now I’ve no idea how to calculate the pull-up resistor for the switch or the resistor for the LED. I was happy to see that it simply worked.



Parts: breadboard, 10k Ohm and 150 Ohm resistors, switch, 1 LED, jumper wires
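For the LED resistor there is at least a simple back-of-the-envelope formula: the resistor has to drop the supply voltage minus the LED forward voltage at the desired current. The numbers below are assumptions (typical values for a red LED on a 3.3 V pin), not measurements:

```python
# Ohm's law check of the 150 Ohm LED resistor.
# Assumed values: 3.3 V GPIO supply, ~2.0 V red LED forward voltage, ~8 mA.
supply_v = 3.3
led_forward_v = 2.0
led_current_a = 0.008

r_ohm = (supply_v - led_forward_v) / led_current_a
print(round(r_ohm))  # ~162, so the 150 Ohm resistor is in the right ballpark
```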

What I learnt so far

There are four- and five-band resistor color codings. One half of the resistors I use are four-band coded and the other half five-band coded. Man, this is like the software world with backward compatibility. Can’t they just use five bands?

The world outside of a computer has x, y, z axes. There is a difference between nice, short and too f***ing short cables. Because my store only had male jumper wires, things are a bit improvised anyway. I had to fit an end connector pin onto the male wire. The guy at the shop said I should solder it on. Yeah right, the first thing I’ll do is soldering; adding some shrink tubing is also a good idea. The x, y, z problem also applies to switches: they actually have to fit into the breadboard.

Software-wise the project is very simple. I use the Raspberry Pi GPIO Python library. I read the state of the switch from port 12, enable the LED on the first state change and disable it on the second.

import RPi.GPIO as GPIO
import time

GPIO.setmode(GPIO.BOARD)   # physical pin numbering
GPIO.setup(11, GPIO.OUT)   # LED
GPIO.setup(12, GPIO.IN)    # switch

state = False

def wait():
    time.sleep(0.1)

while True:
    wait()                     # don't burn the CPU
    if not GPIO.input(12):     # switch pressed (input pulled low)
        print state
        state = not state
        GPIO.output(11, state)
        while not GPIO.input(12):  # wait until the switch is released
            wait()

Not very surprisingly, it’s a good idea not to poll the input port constantly. Otherwise the Python process would use 100% of the available CPU time. I don’t know yet how to replace the polling with GPIO interrupts.
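In the meantime, the toggle-on-press logic itself can be sanity-checked without any hardware by feeding it a scripted sequence of switch reads. Everything here is a stand-in for illustration; no RPi.GPIO involved:

```python
# Hardware-free check of the toggle logic from the loop above: a scripted
# sequence of switch reads (True = released, False = pressed, matching the
# active-low input) stands in for GPIO.input(12).
def run(reads):
    reads = iter(reads)
    state, outputs = False, []
    for pressed_low in reads:
        if not pressed_low:        # switch pressed
            state = not state
            outputs.append(state)  # what GPIO.output(11, state) would see
            for r in reads:        # swallow reads until the switch is released
                if r:
                    break
    return outputs

# press-hold-release, then press again: LED toggles on, then off
print(run([True, False, False, True, False]))  # [True, False]
```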


I think I won the first round of the hardware game:
a) It works.
b) My Raspberry is still alive.
c) It’s quite clear I have to learn and practice a lot.

20121010

My best Python HTTP test server so far

I’ve implemented a bunch of test HTTP servers in Python and now I think I’ve got the implementation right:
  • Test client code doesn’t have to care about the specific HTTP port. Any free port can be used by the test without interference with other running processes.
  • The HTTP server is up and running when the test is ready.
  • Resource handling uses the with statement. The HTTP server is shut down at the end of the test.
  • The concrete request urls (host and port) are transparent to the test.
  • Test can be run fully in memory. The only resource allocated is the HTTP socket.
The actual test is brief. We call http_server with an HTTP handler and a url function is returned. The test can use the url function to create request urls. This is handy as the allocated port of the HTTP server is not fixed. In the test below we check that the returned content matches.
with http_server(Handler) as url:
    assert list(urllib2.urlopen(url("/resource"))) == [content]
To run this test we need an HTTP handler implementation. You could use the SimpleHTTPServer.SimpleHTTPRequestHandler that comes with Python and serve files from a directory. This is a good point to start, but setting up a test folder with the necessary content is cumbersome and inflexible.

This handler runs in memory without any additional setup. It always responds with the 200 status code and writes the content into the response.
import BaseHTTPServer

code, content = 200, "Ok"

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(code)
        self.wfile.write("\n" + content)
The http_server implementation starts a thread, opens a socket and yields the url function. The HTTP request handler runs in the spawned thread.
import contextlib
import functools
import socket
import threading
import SocketServer

@contextlib.contextmanager
def http_server(handler):
    def url(port, path):
        return 'http://%s:%s%s' % (socket.gethostname(), port, path)
    httpd = SocketServer.TCPServer(("", 0), handler)  # port 0: any free port
    t = threading.Thread(target=httpd.serve_forever)
    t.setDaemon(True)
    t.start()
    port = httpd.server_address[1]
    yield functools.partial(url, port)
    httpd.shutdown()
I leave it as an exercise to you to write an implementation that reuses the HTTP server in multiple tests. This could be necessary if the overhead of allocating ports dominates the test running time.
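For what it’s worth, the same pattern ports to Python 3 almost mechanically; only the module names changed (SocketServer became socketserver, BaseHTTPServer became http.server, urllib2 became urllib.request). A sketch:

```python
# Python 3 version of the in-memory HTTP test server pattern above.
import contextlib
import functools
import http.server
import socketserver
import threading
import urllib.request

@contextlib.contextmanager
def http_server(handler):
    def url(port, path):
        return 'http://localhost:%s%s' % (port, path)
    httpd = socketserver.TCPServer(("", 0), handler)  # port 0: any free port
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    try:
        yield functools.partial(url, httpd.server_address[1])
    finally:
        httpd.shutdown()

content = "Ok"

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(content.encode())
    def log_message(self, *args):  # keep test output quiet
        pass

with http_server(Handler) as url:
    assert urllib.request.urlopen(url("/resource")).read() == b"Ok"
```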

20120805

Archiving twitter messages without shortened URLs with Python Twitter Tools

Version 1.9 of the Python Twitter Tools includes the new feature to follow redirects of tweeted urls. Mike Verdone, the developer behind twitter tools, accepted my patch.

The goal is to archive readable URLs in the tweet archive by replacing all URLs from shortening services. You can run it with twitter-archiver -f ThomasAJung to follow all links or twitter-archiver -r bit.ly,t.co,goo.gl ThomasAJung to follow the links of selected hosts.

The archived output changes from:
228771182638419968 2012-07-27 10:37:54 CEST Believe it or not there are people who see interop as a bad thing. It interferes with their business model, ... http://t.co/AhAg41x9

to:
228771182638419968 2012-07-27 10:37:54 CEST Believe it or not there are people who see interop as a bad thing. It interferes with their business model, ... http://scripting.com/stories/2012/07/26/oauth1IsFine.html

You can read the Shortcomings section of the Wikipedia article about URL shortening if you are interested in why you should replace shortened URLs.
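The mechanism behind the feature is plain HTTP: issue a request, let the client follow the redirect and keep the resolved URL. The sketch below demonstrates it with a tiny local server standing in for the shortening service; all paths and names are made up for illustration (the real archiver talks to t.co, bit.ly and friends):

```python
# Sketch: follow a redirect and recover the long URL. A local server plays
# the role of the URL shortener; paths are invented for the example.
import http.server
import socketserver
import threading
import urllib.request

class Shortener(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/short":  # the "shortened" URL
            self.send_response(301)
            self.send_header("Location", "/stories/oauth1IsFine.html")
            self.end_headers()
        else:                      # the "long" target
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"the article")
    def log_message(self, *args):  # keep output quiet
        pass

httpd = socketserver.TCPServer(("", 0), Shortener)
threading.Thread(target=httpd.serve_forever, daemon=True).start()

short = "http://localhost:%d/short" % httpd.server_address[1]
# urlopen follows the redirect; geturl() returns the resolved URL
resolved = urllib.request.urlopen(short).geturl()
httpd.shutdown()
print(resolved)  # ends with /stories/oauth1IsFine.html
```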

20111213

Why I Created Daily Feed Recycler

The motivation behind Daily Feed Recycler is that there is too much content on the Internet: good and bad. Even once I’ve found a good source, I don’t have enough time to actually read everything. I just can’t read 5k-word articles in one sitting. Even if I had the time, I couldn’t do it mentally. It’s not fun.


Feeds are a wonderful tool for authors and readers. They let you stay informed about changes on a page. Feed aggregators are part of this reading experience. They let you manage feeds: read articles, mark them as read, subscribe and aggregate feeds. Applications creating feeds get all this for free, and there is a multitude of feed aggregators available.

Feeds help with the time problem to some extent, but - wonderful as they are - they have a dark side. Instead of solving the problem of organizing content in a way that lets you read the really good stuff, they organize content so that you read the latest stuff. Yes, new content is cool, but the classics are here to stay. You will not get a tweet from Goethe. With Daily Feed Recycler the good stuff is on an equal footing with the latest stuff. The content for every day is presented as new stuff in your feed aggregator.

Content has to be presented in a digestible manner: deep reading, not just skimming to get an overview. Time and mental energy have to be organized in a way that allows reading. Daily Feed Recycler is thought of as a way to break content down into smaller parts and add a reminder that there’s still good content to read. Using your feed aggregator you can decide when you read it, whether you ignore it or whether you read it at all. Feed aggregators are quite good nowadays and their flexibility is useful here.

You can now create your own channel of daily content: the full bash reference, the major Linux man pages or the list of all decision biases from Wikipedia. Anything you like. I’m often looking for daily feeds, but there are not that many of them out there, because they are a lot of work and depend on a certain level of expertise. Daily Feed Recycler is not a competitor to the existing curated daily feeds. A curated feed can have much better quality through expert selection and a logical ordering of content.

Daily Feed Recycler follows the “Release early, Release often” philosophy. There are bugs, missing features and rough edges. I hope you find it useful nonetheless.

20110715

Find cruft with a new Mercurial extension

After some fun with the quick and untested shell script that finds the oldest code in a Subversion repository, the next step was to write a Mercurial extension. The simple Mercurial extension cruft does basically the same job as the shell scripts for Subversion. Being an extension, it’s nicely integrated into Mercurial like the other extensions.


Python and Mercurial are relatively easy to get into. Mercurial provides the Developer Info page, which is really good. Additionally, there’s a guide on how to write a Mercurial extension. The guide is a good start for Mercurial development. The rest can be picked up easily by reading the code of other commands and extensions. The code is readable and there are no big hurdles.

The only thing I missed while writing the extension is type information in method signatures. As much as I like Python, it’s ridiculous to write the type information in the pydoc and let the developer figure out the types. This is one of the trade-offs you have to live with.

Testing Mercurial extensions


It suffices to understand the integration test tool Mercurial uses to test the extension itself. There’s some documentation for this as well. The basic idea behind Cram is to start a process and check its output against the expected output.

The integration test tool defines a small language. All lines without indentation are comments. Indented lines starting with $ are executed and all other indented lines are the expected output. For example, a test looks like this:

init

 $ hg init

 $ cat <<EOF >>a
 > c1
 > c2
 > EOF
 $ hg ci -A -m "commit 0"
 adding a

cruft

 $ hg cruft
 0 a c1
 0 a c2

First a repository is initialized: a file called a with the content (c1, c2) is committed and then Mercurial is run with the cruft command. Without options the cruft extension prints all lines, newest lines first. The expected output is (0 a c1, 0 a c2), which means: revision 0, file a, line c1; revision 0, file a, line c2.

It’s fairly easy to get started with this tool. The only downside of my tests is that they reuse the same test fixture and do not reset it for each test. They are not executed in isolation, which causes a whole range of problems - redundancy and readability, for example - but I didn’t feel it was worth the effort to structure the tests otherwise.

Installing the extension


The easiest way to install the extension is to download the cruft.py to a local folder and add a link to the extension file in the .hgrc file.

[extensions]
cruft=~/.hgext/cruft.py

Using the extension


After the installation you can execute pretty much the same commands as with the shell script version.

hg help cruft

hg cruft

(no help text available)

options:

-l --limit VALUE   oldest lines taken into account
-c --changes       biggest change sets
-f --files         biggest changes per file
-X --filter VALUE  filter lines that match the regular expression
    --mq            operate on patch repository

use "hg -v help cruft" to show global options

I use the quickcheck source code here to show some sample output.

hg cruft -l 5 -X "^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$" quickcheck-core/src/main
This finds the 5 oldest lines, using a Java-specific exclusion pattern (closing braces, comments, imports, class definitions etc.), in the quickcheck-core/src/main folder. The output contains the revision number, source file and source code line.
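The exclusion filter is an ordinary regular expression, so you can check what it drops with Python’s re module. The pattern below is copied verbatim from the command; the sample Java lines are made up for illustration:

```python
# What the -X exclusion pattern filters out, checked with the re module.
import re

java_noise = re.compile(
    r"^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$")

assert java_noise.match("import java.util.List;")     # filtered: import
assert java_noise.match("   }  ")                     # filtered: closing brace
assert java_noise.match("  @Override ")               # filtered: annotation
assert not java_noise.match("next.add(gen.next());")  # kept: a real code line
```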

5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java public Object[] next() {
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java ArrayList<Object> next = new ArrayList<Object>(generators.length);
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java for (Generator<?> gen : generators) {
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java next.add(gen.next());
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java return next.toArray();
You can also find the biggest change sets for the 500 oldest lines.

hg cruft -X "^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$" -l 500 -c quickcheck-core/src/main
This prints the revision number, number of changed lines and commit comment of the change set.

49 41 removed getClassification method from Property interface
moved Classification into quickcheck.property package
177 43 MutationGenerator, CloningMutationGenerator and CloningGenerator added
139 50 fixed generic var arg array problems
5 53 initial check in

Finally, you can find the files with the most lines changed by a single change set (again with the filter and for the 500 oldest lines).

hg cruft -X "^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$" -l 500 -f quickcheck-core/src/main
This prints the revision number, file name, number of changes and change set commit comment.

176 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/AbstractTreeGenerator.java 27 added tree generator
177 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/CloningGenerator.java 28 MutationGenerator, CloningMutationGenerator and CloningGenerator added
139 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/DefaultFrequencyGenerator.java 36 fixed generic var arg array problems
49 quickcheck-core/src/main/java/net/java/quickcheck/characteristic/Classification.java 41 removed getClassification method from Property interface
moved Classification into quickcheck.property package

Conclusion


Developing a Mercurial extension is relatively easy, given Python, the good Mercurial documentation, the readable code base and the integration test tool. If you’re using Mercurial you should give extension development a try. I’ve only recently gotten back into Python, so this is a Python beginner’s version of a Mercurial extension. Help to improve the implementation is always appreciated.

Learning Python and seeing how things are implemented there is fun. Looking at the PEPs and the associated process, they feel much more accessible and open than JSRs. The PEPs are also a track record of the advances the language makes and the problems it tries to solve, one after the other. There’s stuff in Python that you’ll probably never see in Java, like generator expressions. Everyone who has had to replace an internal loop with an iterator will understand that this is not a toy. The language features add up quite nicely and result in a productive environment. As always, some things are unfamiliar or missing, but there’s no perfect platform.