|
I want to write a function whose behavior is as follows:

    foo "string1\nstring2\r\nstring3\nstring4" = ["string1", "string2\r\nstring3", "string4"]

Note the sequence "\r\n", which is ignored. How can I do this?

_______________________________________________
Haskell-Cafe mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-cafe
|
|
On Mon, Jan 02, 2012 at 12:44:23PM +0300, max wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

A short solution, though it requires a regex library:

    > import Text.Regex.PCRE
    > match (makeRegex "(?:[^\r\n]+|\r\n)+" :: Regex) "b\nc\r\n\n\r\n\nd" :: [[String]]
|
|
In reply to this post by Комар Максим
Doesn't the function "lines" handle different line-endings?
(In the Prelude and in Data.List)

If not, doing this with parsec would be easy (yet maybe slightly overkill...)

2012/1/2 max <[hidden email]>
> I want to write a function whose behavior is as follows:
|
|
On Mon, 2 Jan 2012 10:45:18 +0100, Yves Parès <[hidden email]> wrote:
> Doesn't the function "lines" handle different line-endings?
> (In the Prelude and in Data.List)
>
> If not, doing this with parsec would be easy (yet maybe slightly
> overkill...)

    Prelude> lines "string1\nstring2\r\nstring3\nstring4"
    ["string1","string2\r","string3","string4"]
|
|
In reply to this post by Yves Parès
> Doesn't the function "lines" handle different line-endings?
> (In the Prelude and in Data.List)

It does not ignore "\r\n".

Cheers,
Simon
|
|
In reply to this post by Комар Максим
On 02.01.2012 10:44, max wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

Replace the sequence with something unique first, e.g. a single "\r" (and revert this change later). (Replacing a single character is easier using concatMap.)

HTH Christian

    import Data.List (stripPrefix, unfoldr)

    -- | replace first (non-empty) sublist with second one in third
    -- argument list
    replace :: Eq a => [a] -> [a] -> [a] -> [a]
    replace sl r = case sl of
      [] -> error "replace: empty list"
      _ -> concat . unfoldr (\ l -> case l of
        [] -> Nothing
        hd : tl -> Just $ case stripPrefix sl l of
          Nothing -> ([hd], tl)
          Just rt -> (r, rt))
|
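A minimal sketch of how the replace-then-revert idea might be wired together. The name `splitKeepingCRLF` is hypothetical (it does not appear in the thread), `replace` is restated so the block stands alone, and the sketch assumes the input contains no lone "\r" outside a "\r\n" pair, since such a character would be turned into "\r\n" when the change is reverted.

```haskell
import Data.List (stripPrefix, unfoldr)

-- Replace every occurrence of the (non-empty) sublist sl with r.
replace :: Eq a => [a] -> [a] -> [a] -> [a]
replace sl r = case sl of
  [] -> error "replace: empty list"
  _  -> concat . unfoldr step
  where
    step [] = Nothing
    step l@(hd : tl) = Just $ case stripPrefix sl l of
      Nothing -> ([hd], tl) -- no match here: copy one element
      Just rt -> (r, rt)    -- match: emit the replacement, skip sl

-- Hypothetical name. Assumes no lone '\r' in the input.
-- 1. collapse "\r\n" to "\r" so that `lines` no longer sees it,
-- 2. split on the remaining '\n' characters,
-- 3. expand "\r" back to "\r\n" in every piece.
splitKeepingCRLF :: String -> [String]
splitKeepingCRLF = map (replace "\r" "\r\n") . lines . replace "\r\n" "\r"
```

Applied to the string from the original question, this gives `["string1","string2\r\nstring3","string4"]`.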
|
In reply to this post by Комар Максим
On 02/01/2012 09:44, max wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

Doing it probably the hard way (and getting it wrong) looks like the following...

    -- Function to accept (normally) a single character. Special-cases
    -- \r\n. Refuses to accept \n. Result is either an empty list, or
    -- an (accepted, remaining) pair.
    parseTok :: String -> [(String, String)]
    parseTok "" = []
    parseTok (c1:c2:cs) | ((c1 == '\r') && (c2 == '\n')) = [(c1:c2:[], cs)]
    parseTok (c:cs) | (c /= '\n') = [(c:[], cs)]
                    | True = []

    -- Accept a sequence of those (mostly single) characters
    parseItem :: String -> [(String, String)]
    parseItem "" = [("","")]
    parseItem cs = [(j1s ++ j2s, k2s)
                   | (j1s,k1s) <- parseTok cs
                   , (j2s,k2s) <- parseItem k1s
                   ]

    -- Accept a whole list of strings
    parseAll :: String -> [([String], String)]
    parseAll [] = [([],"")]
    parseAll cs = [(j1s:j2s,k2s)
                  | (j1s,k1s) <- parseItem cs
                  , (j2s,k2s) <- parseAll k1s
                  ]

    -- Get the first valid result, which should have consumed the
    -- whole string but this isn't checked. No check for existence either.
    parse :: String -> [String]
    parse cs = fst (head (parseAll cs))

I got it wrong in that this never consumes the \n between items, so it'll all go horribly wrong. There's a good chance there's a typo or two as well. The basic idea should be clear, though - maybe I should fix it but I've got some other things to do at the moment.

Think of the \n as a separator, or as a prefix to every "item" but the first. Alternatively, treat it as a prefix to *every* item, and artificially add an initial one to the string in the top-level parse function. Then use tail etc. to remove that from the first item.

See http://channel9.msdn.com/Tags/haskell - there's a series of 13 videos by Dr. Erik Meijer. The eighth in the series covers this basic technique. It calls them monadic and uses the do notation, and that confused me slightly at first: it's the *list* type which is monadic in this case and (as you can see) I prefer to use list comprehensions rather than do notation.

There may be a simpler way, though - there's still a fair bit of Haskell and its ecosystem I need to figure out. There's a tool called alex, for instance, but I've not used it.
|
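One possible repair of the list-of-successes parser above, following the "treat \n as a separator" hint. This is a sketch rather than the poster's own fix: `parseSep` is a hypothetical helper, `parseItem` is changed so that an item may end before a "\n", and `parse` is changed to return a `Maybe` and to accept only results that consumed the whole input.

```haskell
import Data.Maybe (listToMaybe)

-- One token: "\r\n" as a unit, or any single character except '\n'.
parseTok :: String -> [(String, String)]
parseTok ('\r' : '\n' : cs) = [("\r\n", cs)]
parseTok (c : cs) | c /= '\n' = [([c], cs)]
parseTok _ = []

-- Zero or more tokens, longest match first; the final alternative
-- lets an item stop wherever no further token is taken.
parseItem :: String -> [(String, String)]
parseItem cs =
  [ (tok ++ rest, cs'')
  | (tok, cs') <- parseTok cs
  , (rest, cs'') <- parseItem cs'
  ]
  ++ [("", cs)]

-- One item, then whatever follows the separator (if any).
parseAll :: String -> [([String], String)]
parseAll cs =
  [ (item : items, cs'')
  | (item, cs') <- parseItem cs
  , (items, cs'') <- parseSep cs'
  ]

-- Hypothetical helper: consume a single '\n' separator and continue,
-- or stop and hand back the unconsumed input.
parseSep :: String -> [([String], String)]
parseSep ('\n' : cs) = parseAll cs
parseSep cs = [([], cs)]

-- First result that consumed the entire input, if there is one.
parse :: String -> Maybe [String]
parse cs = listToMaybe [items | (items, "") <- parseAll cs]
```

For the string from the original question, `parse` returns `Just ["string1","string2\r\nstring3","string4"]`. Because the greedy alternative is tried first and results are produced lazily, the shorter alternatives are only explored on backtracking.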
|
In reply to this post by Комар Максим
On Mon, Jan 2, 2012 at 3:14 PM, max <[hidden email]> wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

Here's a simple way (may not be the most efficient) -

    import Data.List (isSuffixOf)

    split = reverse . foldl f [] . lines
      where
        f [] w = [w]
        f (x:xs) w = if "\r" `isSuffixOf` x then ((x++"\n"++w):xs) else (w:x:xs)

Testing -

    ghci> split "ab\r\ncd\nefgh\nhijk"
    ["ab\r\ncd","efgh","hijk"]

-- Anupam
|
|
In reply to this post by Комар Максим
On Mon, Jan 02, 2012 at 12:44:23PM +0300, max wrote:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

    unixLines :: String -> [String]
    unixLines xs = reverse . map reverse $ go xs "" []
      where
        go [] l ls = l:ls
        go ('\r':'\n':xs) l ls = go xs ('\n':'\r':l) ls
        go ('\n':xs) l ls = go xs "" (l:ls)
        go (x:xs) l ls = go xs (x:l) ls
|
|
In reply to this post by Комар Максим
Okay, so it doesn't handle different line-endings.
I have a more general solution (statefulSplit): http://hpaste.org/55980

I cannot test it as I don't have an interpreter at hand, but if someone has, I'd be glad to have comments. (It might be more readable by using the State monad.)

2012/1/2 max <[hidden email]>
> On Mon, 2 Jan 2012 10:45:18 +0100
|
|
In reply to this post by Комар Максим
max <[hidden email]> writes:
> I want to write a function whose behavior is as follows:
>
> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
> "string2\r\nstring3", "string4"]
>
> Note the sequence "\r\n", which is ignored. How can I do this?

    cabal install split

then do something like

    import Data.List (groupBy)
    import Data.List.Split (splitOn)

    rn '\r' '\n' = True
    rn _ _ = False

    required_function = fmap concat . splitOn ["\n"] . groupBy rn

(though that might be an abuse of groupBy)

-- Jón Fairbairn [hidden email]
|
|
On Mon, 02 Jan 2012 11:12:49 +0000, Jon Fairbairn <[hidden email]> wrote:
> cabal install split
>
> then do something like
>
> import Data.List (groupBy)
> import Data.List.Split (splitOn)
>
> rn '\r' '\n' = True
> rn _ _ = False
>
> required_function = fmap concat . splitOn ["\n"] . groupBy rn
>
> (though that might be an abuse of groupBy)

This is the simplest of the proposed solutions, in my opinion. Thank you very much.
|
|
On Mon, Jan 2, 2012 at 10:12 AM, max <[hidden email]> wrote:
> This is the simplest solution of the proposed, in my opinion. Thank you
> very much.

Better yet, don't use String and use Text. Then you just need T.splitOn "\r\n" [1].

Cheers,

[1] http://hackage.haskell.org/packages/archive/text/0.11.1.12/doc/html/Data-Text.html#v:splitOn

-- Felipe.
|
|
On Mon, Jan 2, 2012 at 5:52 PM, Felipe Almeida Lessa <[hidden email]> wrote:
> Better yet, don't use String and use Text. Then you just need
> T.splitOn "\r\n" [1].

That is actually the opposite of what the OP wants. However, it's interesting that Text has a function like that while the String functions in the standard library do not.

-- Anupam
|
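To make the "opposite" point concrete, here is a small sketch using Data.Text. The names `splitAtCRLF` and `splitKeepingCRLF` are hypothetical; only `T.splitOn` and `T.isSuffixOf` come from the text package. The second function is one possible way to get the behaviour asked for in the original question: split on every "\n", then glue a piece back onto its successor whenever it ends in "\r".

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- What T.splitOn "\r\n" does: it splits *at* the CRLF pairs and
-- leaves lone '\n' characters inside the pieces.
splitAtCRLF :: T.Text -> [T.Text]
splitAtCRLF = T.splitOn "\r\n"

-- Hypothetical helper for the behaviour the OP asked for:
-- split at lone '\n', keep "\r\n" inside the pieces.
splitKeepingCRLF :: T.Text -> [T.Text]
splitKeepingCRLF = foldr merge [] . T.splitOn "\n"
  where
    -- A piece ending in '\r' was cut in the middle of a "\r\n",
    -- so re-join it with the piece that follows.
    merge piece (next : rest)
      | "\r" `T.isSuffixOf` piece = (piece <> "\n" <> next) : rest
    merge piece acc = piece : acc
```

On the string from the original question, `splitAtCRLF` yields `["string1\nstring2","string3\nstring4"]`, whereas `splitKeepingCRLF` yields `["string1","string2\r\nstring3","string4"]`.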
|
String is really for small strings. Text is more efficient and also has more functionality, including most, if not all, of the functions defined for String.

On Mon, Jan 2, 2012 at 3:12 PM, Anupam Jain <[hidden email]> wrote:
> That is actually the opposite of what the OP wants, however it's
> interesting that Text has a function like that and not the String
> functions in the standard library.

-- Markus Läll
|
|
In reply to this post by Комар Максим
If you're interested in learning parsec, RWH covered this topic in depth in Chapter 16, Choices and Errors: http://book.realworldhaskell.org/read/using-parsec.html.

On Mon, Jan 2, 2012 at 3:44 AM, max <[hidden email]> wrote:
> I want to write a function whose behavior is as follows:
|
|
In reply to this post by Jon Fairbairn
On 02/01/2012 11:12, Jon Fairbairn wrote:
> cabal install split
>
> then do something like
>
> import Data.List (groupBy)
> import Data.List.Split (splitOn)
>
> rn '\r' '\n' = True
> rn _ _ = False
>
> required_function = fmap concat . splitOn ["\n"] . groupBy rn
>
> (though that might be an abuse of groupBy)

It has (I think) a subtle bug as a result. I was inspired by this to try some other groupBy stuff, and it didn't work. After scratching my head a bit, I tried the following...

    Prelude> import Data.List
    Prelude Data.List> groupBy (<) [1,2,3,2,1,2,3,2,1]
    [[1,2,3,2],[1,2,3,2],[1]]

That wasn't exactly the result I was expecting :-(

Explanation (best guess) - the function passed to groupBy, according to the docs, is meant to test whether two values are 'equal'. I'm guessing the assumption is that the function will effectively treat values as belonging to equivalence classes. That implies some rules such as...

                   (a == a)
    reflexivity  : (a == b) => (b == a)
    transitivity : (a == b) && (b == c) => (a == c)

I'm not quite certain I got those names right, and I can't remember the name of the first rule at all, sorry.

The third rule is probably to blame here. By the rules, groupBy doesn't need to compare adjacent items. When it starts a new group, it seems to always use the first item in that new group until it finds a mismatch. In my test, that means it's always comparing with 1 - the second 2 is included in each group because although (3 < 2) is False, groupBy isn't testing that - it's testing (1 < 2).

In the context of this \r\n test function, this behaviour will I guess result in \r\n\n being combined into one group. The second \n will therefore not be seen as a valid splitting point.

Personally, I think this is a tad disappointing. Given that groupBy cannot check or enforce that its test respects equivalence classes, it should ideally give results that make as much sense as possible either way. That said, even if the test was always given adjacent elements, there's still room for a different order of processing the list (left-to-right or right-to-left) to give different results - and in any case, maybe it's more efficient the way it is.
|
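The behaviour described above can be checked directly. The sketch below restates `rn` from the quoted message, reproduces both observations, and adds a variant that compares adjacent elements instead. `groupAdjacentBy`, `demoNumbers` and `demoCRLF` are hypothetical names, not library functions; the variant is built with `foldr` and pattern-matches on its accumulator, so unlike `groupBy` it is not suitable for infinite lists.

```haskell
import Data.List (groupBy)

rn :: Char -> Char -> Bool
rn '\r' '\n' = True
rn _ _ = False

-- groupBy tests each candidate against the FIRST element of the
-- current group, not against its immediate predecessor.
demoNumbers :: [[Int]]
demoNumbers = groupBy (<) [1, 2, 3, 2, 1, 2, 3, 2, 1]
-- [[1,2,3,2],[1,2,3,2],[1]]

-- The same effect swallows a '\n' that follows a "\r\n" pair.
demoCRLF :: [String]
demoCRLF = groupBy rn "a\r\n\nb"
-- ["a","\r\n\n","b"]

-- Hypothetical variant that compares adjacent elements only.
groupAdjacentBy :: (a -> a -> Bool) -> [a] -> [[a]]
groupAdjacentBy eq = foldr step []
  where
    step x ((y : ys) : groups) | eq x y = (x : y : ys) : groups
    step x groups = [x] : groups
```

With the adjacent-element variant, `groupAdjacentBy rn "a\r\n\nb"` gives `["a","\r\n","\n","b"]`, so the second "\n" stays a separate group.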
|
On 04/01/2012 16:47, Steve Horne wrote:
>                (a == a)
> reflexivity  : (a == b) => (b == a)
> transitivity : (a == b) && (b == c) => (a == c)

Oops - that's...

    reflexivity  : (a == a)
    symmetry     : (a == b) => (b == a)
    transitivity : (a == b) && (b == c) => (a == c)

An equivalence relation is a relation that meets all these conditions.
|
|
In reply to this post by Steve Horne
On 04.01.2012 17:47, Steve Horne wrote:
> On 02/01/2012 11:12, Jon Fairbairn wrote:
>> max<[hidden email]> writes:
>>
>>> I want to write a function whose behavior is as follows:
>>>
>>> foo "string1\nstring2\r\nstring3\nstring4" = ["string1",
>>> "string2\r\nstring3", "string4"]
>>>
>>> Note the sequence "\r\n", which is ignored. How can I do this?

Why do you have these (unhealthy) different kinds of line breaks (Unix and Windows style) in your string in the first place? I hope it is not the result of something calling "unlines" (or intercalate "\n") earlier.

Cheers
Christian
|
|
In reply to this post by Steve Horne
On Wed, 04 Jan 2012 17:49:15 +0000, Steve Horne <[hidden email]> wrote:
> Oops - that's...
>
> reflexivity  : (a == a)
> symmetry     : (a == b) => (b == a)
> transitivity : (a == b) && (b == c) => (a == c)
>
> An equivalence relation is a relation that meets all these conditions.

I prefer to use "transymmetry" (although I guess it is not a regular word):

    reflexivity:  a ≃ a
    transymmetry: ∀ a b. b≃a ⇒ ∀ c. c≃a ⇒ b≃c

so I only have 2 rules.

Transymmetry is trivially derived from transitivity and symmetry. Symmetry is trivially derived from reflexivity and transymmetry. Transitivity is trivially derived from symmetry and transymmetry (and thus from transymmetry and reflexivity).
|
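The two derivations called trivial above can be spelled out. The following Lean 4 sketch treats the relation as an arbitrary `r : α → α → Prop`; the theorem names are made up for illustration, and the hypotheses `refl` and `ts` are direct transcriptions of the two rules as stated in the message.

```lean
variable {α : Type} (r : α → α → Prop)

-- Symmetry from reflexivity and transymmetry:
-- instantiate transymmetry with a := y, b := y, c := x.
theorem symm_of_refl_transymm
    (refl : ∀ a, r a a)
    (ts : ∀ a b, r b a → ∀ c, r c a → r b c) :
    ∀ x y, r x y → r y x :=
  fun x y hxy => ts y y (refl y) x hxy

-- Transitivity from reflexivity and transymmetry:
-- flip y ≃ z into z ≃ y using the theorem above, then
-- instantiate transymmetry with a := y, b := x, c := z.
theorem trans_of_refl_transymm
    (refl : ∀ a, r a a)
    (ts : ∀ a b, r b a → ∀ c, r c a → r b c) :
    ∀ x y z, r x y → r y z → r x z :=
  fun x y z hxy hyz =>
    ts y x hxy z (symm_of_refl_transymm r refl ts y z hyz)
```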
