
Re: Converting wiki pages into pdf


Dirk Hünniger
I have invested an enormous amount of time in this problem, and as a
result I have a solution that works very well.

http://de.wikibooks.org/wiki/Benutzer:Dirk_Huenniger/wb2pdf
http://en.wikibooks.org/wiki/File:Haskell.pdf

I would be happy if you find it useful.
Yours Dirk Hünniger

> Thu, 08 Sep 2011 05:36:44 -0700
> Hello all
> I am trying to write a Haskell program which downloads HTML pages from
> Wikipedia, including images, and converts them into PDF. I wrote a
> small script:
>
> import Network.HTTP
> import Data.Maybe
> import Data.List
>
> main = do
>          x <- getLine
>          -- open url
>          htmlpage <- getResponseBody =<< simpleHTTP (getRequest x)
>          --print.words $ htmlpage
>          let ind_1 = fromJust . (\n -> findIndex (n `isPrefixOf`) . tails $ htmlpage) $ "<!-- content -->"
>              ind_2 = fromJust . (\n -> findIndex (n `isPrefixOf`) . tails $ htmlpage) $ "<!-- /content -->"
>              tmphtml = drop ind_1 $ take ind_2 htmlpage
>          writeFile "down.html" tmphtml
>
> and it is working fine, except that some symbols are not rendering as
> they should. Could someone please suggest how to accomplish this
> task?
>
> Thank you
> Mukesh Tiwari
>
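
The marker-based extraction in the quoted script can also be sketched as a pure function that returns a Maybe instead of calling fromJust, so a page that lacks the markers does not crash the program. This is only an illustration under stated assumptions: the names indexOf and extractBetween and the sample page string are made up here and appear in neither message, and the check that the start marker comes before the end marker is an addition. Only functions from Data.List are used.

```haskell
import Data.List (findIndex, isPrefixOf, tails)

-- | Index of the first occurrence of a marker in a string, if any.
-- (Hypothetical helper; same search as ind_1 / ind_2 in the quoted script.)
indexOf :: String -> String -> Maybe Int
indexOf marker = findIndex (marker `isPrefixOf`) . tails

-- | Text from the start marker (inclusive) up to the end marker (exclusive),
-- mirroring `drop ind_1 $ take ind_2 htmlpage`.
-- Returns Nothing if either marker is missing or they are out of order.
extractBetween :: String -> String -> String -> Maybe String
extractBetween start end page = do
  i <- indexOf start page
  j <- indexOf end page
  if i <= j
    then Just (drop i (take j page))
    else Nothing

main :: IO ()
main = do
  -- Made-up sample input standing in for a downloaded page.
  let page = "<html><!-- content --><p>Hello</p><!-- /content --></html>"
  print (extractBetween "<!-- content -->" "<!-- /content -->" page)
  print (extractBetween "<!-- content -->" "<!-- /content -->" "no markers")
```

In the original script, the result of extractBetween could be pattern-matched before calling writeFile, so a missing marker is reported rather than raising an exception from fromJust.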

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe