Friday, November 5, 2010

jQuery and Node.js

jQuery makes it really easy to select certain parts not only of an HTML page but also of an XML document. So if you just want to parse a few XML files that are not too big, or you do not care too much about the speed of your parser, a combination of Node.js and jQuery can help a lot and make the job much less painful.

So what is needed (Ubuntu Linux):
  • Install Node.js on Ubuntu:
  • Download jQuery (you could also include an online version; Google is your friend). I use jquery-1.4.2.min.js (1.4.3 did not work for me).
Then try this example:

Type into a terminal:
> node helloJQueryNode.js

It should write to the console:
> Hello World, It works! Really!

Now a more complex example:

Again, type into a terminal:
> node xmlTest.js

It should write to the console:
> 229.29

Pretty simple! So what does it do? You have an XML file: Node.js fetches the file and jQuery is used for parsing. Using jQuery selector syntax to extract the parts you are looking for from an XML file is typically a lot easier than, for example, using XPath.
As already mentioned, for big XML files, or if you have lots of files to crawl, *I* would more likely use Java (Apache HttpClient plus a fast StAX parser such as Woodstox). For very simple tasks, Node.js + jQuery is a really good choice.

Some more about Node.js:

Friday, October 29, 2010

How to convert SVG files to pdf with Inkscape

This short post concerns Windows. I usually keep my standard batch files in a folder that is on the PATH (e.g. C:\Users\me\work\tools\bat\ ).

In this folder I have two batch files:

  • inkscape.bat containing this line: "C:\Program Files\Inkscape\inkscape.exe" %*
    • don’t forget the quotes if you have white space in your path to Inkscape
    • %* means: hand over all provided parameters/arguments

  • svg2pdf.bat containing this line: inkscape "%CD%\%1" -D --export-pdf "%CD%\%1.pdf"
    • %1 stands for the first provided parameter (you could modify this to check whether a second parameter is given and use that one as the output file)
    • %CD% stands for the current directory
    • -D stands for exporting only the drawing area. The allowed parameters for Inkscape can be found in the manual.
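
The "second parameter as output file" idea mentioned above could look roughly like the following. This is only a sketch in Python rather than a modified batch file; the function name build_command is made up for illustration, the Inkscape flags are the ones from the batch line above, and it assumes an inkscape executable (not just the .bat wrapper) can be found on the PATH.

```python
import os
import subprocess
import sys


def build_command(args, cwd=None):
    """Return the Inkscape argument list for `svg2pdf INPUT [OUTPUT]`.

    build_command is a hypothetical helper name, not part of Inkscape.
    """
    if not 1 <= len(args) <= 2:
        raise ValueError("usage: svg2pdf INPUT.svg [OUTPUT.pdf]")
    # Same role as %CD% in the batch file: resolve names against the current directory.
    cwd = cwd if cwd is not None else os.getcwd()
    source = os.path.join(cwd, args[0])
    # Without a second parameter, keep the batch file's naming: INPUT + ".pdf".
    target = os.path.join(cwd, args[1] if len(args) == 2 else args[0] + ".pdf")
    # Flags copied from the batch line above; "inkscape" on the PATH is an assumption.
    return ["inkscape", source, "-D", "--export-pdf", target]


if __name__ == "__main__" and len(sys.argv) > 1:
    subprocess.run(build_command(sys.argv[1:]), check=True)
```

Called with one argument it behaves like the batch file (foo.svg becomes foo.svg.pdf); with two arguments the second one is used as the output name.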

In a command window, navigate to the desired folder (in Windows XP, navigate to the folder, press [Windows Key] + [r] and enter cmd; in Windows 7, navigate to the parent folder of your desired folder, press [Shift] + your right mouse key on the folder and select “Open command window here”) and enter, for example: svg2pdf foo.svg

One cool thing about SVGs is that it is easy to change the font of the text: you simply replace the font name. Here is a simple Python script for that, which additionally calls Inkscape afterwards to convert the file (of course this works only if you have Python installed):
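
The original script is not reproduced here, so what follows is only a minimal sketch of the idea, not the author's code. It assumes the font is declared via font-family (either in a style attribute or as its own attribute), that an inkscape executable is on the PATH, and that the -D and --export-pdf flags from the batch file above apply. The names replace_font and svg_to_pdf are made up for illustration.

```python
import re
import subprocess
import sys
from pathlib import Path


def replace_font(svg_text, old_font, new_font):
    """Swap old_font for new_font wherever a font-family is declared."""
    # Covers the CSS form (font-family:Arial) and the attribute form (font-family="Arial").
    # Caveat: this is a prefix match, so "Arial" also hits "Arial Black".
    pattern = re.compile(
        r"(font-family\s*[:=]\s*[\"']?)" + re.escape(old_font), re.IGNORECASE
    )
    # A function as replacement avoids backslash surprises in new_font.
    return pattern.sub(lambda m: m.group(1) + new_font, svg_text)


def svg_to_pdf(svg_path, inkscape="inkscape"):
    """Convert svg_path to a PDF next to it, using the flags from the batch file."""
    pdf_path = svg_path.with_suffix(".pdf")
    subprocess.run(
        [inkscape, str(svg_path), "-D", "--export-pdf", str(pdf_path)], check=True
    )
    return pdf_path


if __name__ == "__main__" and len(sys.argv) == 4:
    source, old, new = Path(sys.argv[1]), sys.argv[2], sys.argv[3]
    # Write the changed SVG to a new file so the original stays untouched.
    target = source.with_name(source.stem + "_" + new.replace(" ", "_") + ".svg")
    text = source.read_text(encoding="utf-8")
    target.write_text(replace_font(text, old, new), encoding="utf-8")
    svg_to_pdf(target)
```

Saved under a name such as changefont.py (hypothetical), it would be called like this: python changefont.py foo.svg Arial Verdana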


One other cool thing about Inkscape: if you have a tool such as UMLet, which can export to SVG but renders fonts as lines, you cannot replace the fonts in that SVG. (Using the tool's PDF export directly is not an option either: you want to replace the standard font it uses, and you cannot afford an Acrobat licence.) In that case you can use Inkscape to convert the PDF to an SVG with a simple command: inkscape foo.pdf --export-plain-svg foo.svg, or you can create a batch file analogous to the one above (e.g. svg2svg.bat): inkscape "%CD%\%1" -D --export-plain-svg "%CD%\%1.svg". Now that you have an SVG, you can replace the font in it and then export the SVG to a PDF.

Note: I sometimes had problems with the newest versions of Inkscape rendering fonts as lines in the resulting PDF, thereby losing the plain text. Version 0.46 keeps the text as text. That might have to do with additional parameters or whatever. I do not care. Keep it simple, make it work…