# One-Off detection. Question about Space/Time complexity

8 messages
Open this post in threaded view
|

## One-Off detection. Question about Space/Time complexity

 Problem spec from CareerCup. Given two strings, return boolean True/False if they are only one edit apart. Edit can be insert/delete/update of only one character in the string. Eg. -True xyz,xz xyz,xyk xy,xyz -False xyz,xyz xyz,xzy x,xyz ``````module Strings.OneLetter where` import Prelude import qualified Data.Text as T` oneLetter :: T.Text -> T.Text -> Bool oneLetter s1 s2 | s1 == s2 = False | max_l > (min_l + 1) = False | max_l == min_l = diff_size == 1 | otherwise = diff_size == 0 where length_s1 = T.length s1 length_s2 = T.length s2 max_l = max length_s1 length_s2 min_l = min length_s1 length_s2 diff_size = length \$ filter (\(a,b) -> a /= b) zipped zipped = T.zip s1 s2``````` So, I used `Text` instead of `String`, hoping I could take advantage of fusion. I have the following questions and my initial attempt to answer them. What is the time complexity of `oneLetter` 0(m+n) where m is the length of `s1` and n is the length of `s2` what is the space complexity of `oneLetter`? I'm thinking due to laziness it's O(1), only two `Char`s are in memory at any one time, or two `Int`s. But I'm hazy on why. If this is wrong, please articulate why. If I'm right, and my reasoning is wrong or incomplete, please say why. I don't think I can improve the time complexity. Am I right? Can the space complexity be improved?What if I changed from `Text` to `String`? I don't think the time complexity changes, but how does this change the space complexity? _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post.
Open this post in threaded view
|

## Re: One-Off detection. Question about Space/Time complexity

 I think the space must be O(m+n) since you are using strict Text values. But even if you weren't, you're using s1 and s2 in more than one place so they will be shared, leading to them both being fully in memory at once.The minimum possible space complexity could well be O(1) if using String or lazy Text.The time complexity looks like O(min(m,n)), because that's the cost of `zip`/`filter`, not counting the cost of loading the two Texts into memory in the first place (which will be O(m+n)). If you used String, it'd be O(m+n) since that's the cost of the two calls to `length`. I think the minimum possible time complexity will be O(min(m,n)), and that could include the cost of loading the strings too.On 22 Mar 2017 18:32, "Michael Litchard" <[hidden email]> wrote: Problem spec from CareerCup. Given two strings, return boolean True/False if they are only one edit apart. Edit can be insert/delete/update of only one character in the string. Eg. -True xyz,xz xyz,xyk xy,xyz -False xyz,xyz xyz,xzy x,xyz ``````module Strings.OneLetter where` import Prelude import qualified Data.Text as T` oneLetter :: T.Text -> T.Text -> Bool oneLetter s1 s2 | s1 == s2 = False | max_l > (min_l + 1) = False | max_l == min_l = diff_size == 1 | otherwise = diff_size == 0 where length_s1 = T.length s1 length_s2 = T.length s2 max_l = max length_s1 length_s2 min_l = min length_s1 length_s2 diff_size = length \$ filter (\(a,b) -> a /= b) zipped zipped = T.zip s1 s2``````` So, I used `Text` instead of `String`, hoping I could take advantage of fusion. I have the following questions and my initial attempt to answer them. What is the time complexity of `oneLetter` 0(m+n) where m is the length of `s1` and n is the length of `s2` what is the space complexity of `oneLetter`? I'm thinking due to laziness it's O(1), only two `Char`s are in memory at any one time, or two `Int`s. But I'm hazy on why. If this is wrong, please articulate why. If I'm right, and my reasoning is wrong or incomplete, please say why. I don't think I can improve the time complexity. Am I right? Can the space complexity be improved?What if I changed from `Text` to `String`? I don't think the time complexity changes, but how does this change the space complexity? _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post.
Open this post in threaded view
|

## Re: One-Off detection. Question about Space/Time complexity

 It can be done in one sweep down both strings together - look for the first difference,and then test hypotheses regarding the 5 possible editsoneOff :: String -> String -> BooloneOff [] [] = FalseoneOff [] (_:rest) = null restoneOff (_:rest) [] = null restoneOff s1@(c1:s1') s2@(c2:s2') | c1 == c2  =  oneOff s1' s2'  -- keep looking | otherwise       =    s1' == s2   -- s2 is s1 with c1 deleted or s1 is s2 with c1 inserted        || s2' == s1   -- s1 is s2 with c2 deleted or s2 is s1 with c2 inserted       || s1' == s2'  -- c1 is s1 was replaced by c2 in s2, or v.v.I guess the above should be O(1) in space - it's just a list crawl with testsO(n) in the length of the shortest StringCheers,  AndrewOn 22 Mar 2017, at 18:48, David Turner <[hidden email]> wrote:I think the space must be O(m+n) since you are using strict Text values. But even if you weren't, you're using s1 and s2 in more than one place so they will be shared, leading to them both being fully in memory at once.The minimum possible space complexity could well be O(1) if using String or lazy Text.The time complexity looks like O(min(m,n)), because that's the cost of `zip`/`filter`, not counting the cost of loading the two Texts into memory in the first place (which will be O(m+n)). If you used String, it'd be O(m+n) since that's the cost of the two calls to `length`. I think the minimum possible time complexity will be O(min(m,n)), and that could include the cost of loading the strings too.On 22 Mar 2017 18:32, "Michael Litchard" <[hidden email]> wrote:Problem spec from CareerCup.Given two strings, return boolean True/False if they are only one edit apart. Edit can be insert/delete/update of only one character in the string. Eg.-Truexyz,xzxyz,xykxy,xyz-Falsexyz,xyzxyz,xzyx,xyz ``````module Strings.OneLetter where` import Prelude import qualified Data.Text as T` oneLetter :: T.Text -> T.Text -> Bool oneLetter s1 s2 | s1 == s2 = False | max_l > (min_l + 1) = False | max_l == min_l = diff_size == 1 | otherwise = diff_size == 0 where length_s1 = T.length s1 length_s2 = T.length s2 max_l = max length_s1 length_s2 min_l = min length_s1 length_s2 diff_size = length \$ filter (\(a,b) -> a /= b) zipped zipped = T.zip s1 s2```````So, I used `Text` instead of `String`, hoping I could take advantage of fusion. I have the following questions and my initial attempt to answer them.What is the time complexity of `oneLetter` 0(m+n) where m is the length of `s1` and n is the length of `s2`what is the space complexity of `oneLetter`? I'm thinking due to laziness it's O(1), only two `Char`s are in memory at any one time, or two `Int`s. But I'm hazy on why. If this is wrong, please articulate why. If I'm right, and my reasoning is wrong or incomplete, please say why.I don't think I can improve the time complexity. Am I right? Can the space complexity be improved?What if I changed from `Text` to `String`? I don't think the time complexity changes, but how does this change the space complexity? _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post _______________________________________________Haskell-Cafe mailing listTo (un)subscribe, modify options or view archives go to:http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post. Andrew ButterfieldSchool of Computer Science & StatisticsTrinity CollegeDublin 2, Ireland _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post.
Open this post in threaded view
|

## Re: One-Off detection. Question about Space/Time complexity

 I also think it can be done in one sweep, but I think this code is O(max(m,n)-min(m,n)): s1' and s2' each appear twice so will be shared.On 23 March 2017 at 08:52, Andrew Butterfield wrote:It can be done in one sweep down both strings together - look for the first difference,and then test hypotheses regarding the 5 possible editsoneOff :: String -> String -> BooloneOff [] [] = FalseoneOff [] (_:rest) = null restoneOff (_:rest) [] = null restoneOff s1@(c1:s1') s2@(c2:s2') | c1 == c2  =  oneOff s1' s2'  -- keep looking | otherwise       =    s1' == s2   -- s2 is s1 with c1 deleted or s1 is s2 with c1 inserted        || s2' == s1   -- s1 is s2 with c2 deleted or s2 is s1 with c2 inserted       || s1' == s2'  -- c1 is s1 was replaced by c2 in s2, or v.v.I guess the above should be O(1) in space - it's just a list crawl with testsO(n) in the length of the shortest StringCheers,  AndrewOn 22 Mar 2017, at 18:48, David Turner <[hidden email]> wrote:I think the space must be O(m+n) since you are using strict Text values. But even if you weren't, you're using s1 and s2 in more than one place so they will be shared, leading to them both being fully in memory at once.The minimum possible space complexity could well be O(1) if using String or lazy Text.The time complexity looks like O(min(m,n)), because that's the cost of `zip`/`filter`, not counting the cost of loading the two Texts into memory in the first place (which will be O(m+n)). If you used String, it'd be O(m+n) since that's the cost of the two calls to `length`. I think the minimum possible time complexity will be O(min(m,n)), and that could include the cost of loading the strings too.On 22 Mar 2017 18:32, "Michael Litchard" <[hidden email]> wrote:Problem spec from CareerCup.Given two strings, return boolean True/False if they are only one edit apart. Edit can be insert/delete/update of only one character in the string. Eg.-Truexyz,xzxyz,xykxy,xyz-Falsexyz,xyzxyz,xzyx,xyz ``````module Strings.OneLetter where` import Prelude import qualified Data.Text as T` oneLetter :: T.Text -> T.Text -> Bool oneLetter s1 s2 | s1 == s2 = False | max_l > (min_l + 1) = False | max_l == min_l = diff_size == 1 | otherwise = diff_size == 0 where length_s1 = T.length s1 length_s2 = T.length s2 max_l = max length_s1 length_s2 min_l = min length_s1 length_s2 diff_size = length \$ filter (\(a,b) -> a /= b) zipped zipped = T.zip s1 s2```````So, I used `Text` instead of `String`, hoping I could take advantage of fusion. I have the following questions and my initial attempt to answer them.What is the time complexity of `oneLetter` 0(m+n) where m is the length of `s1` and n is the length of `s2`what is the space complexity of `oneLetter`? I'm thinking due to laziness it's O(1), only two `Char`s are in memory at any one time, or two `Int`s. But I'm hazy on why. If this is wrong, please articulate why. If I'm right, and my reasoning is wrong or incomplete, please say why.I don't think I can improve the time complexity. Am I right? Can the space complexity be improved?What if I changed from `Text` to `String`? I don't think the time complexity changes, but how does this change the space complexity? _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post _______________________________________________Haskell-Cafe mailing listTo (un)subscribe, modify options or view archives go to:http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post. Andrew ButterfieldSchool of Computer Science & StatisticsTrinity CollegeDublin 2, Ireland _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post.
Open this post in threaded view
|

## Re: One-Off detection. Question about Space/Time complexity

 Yes - they both might be scanned twice, but O(2n) = O(n) = O(3n), even.The scanniing by  the three uses of == only walk over *all* of the longer list if its length is at most that of the shorterSimply put, if  s1 is much longer than s2, then those == scans will end when they reach the end of the shorter s2Or am I missing something?On 23 Mar 2017, at 15:19, David Turner <[hidden email]> wrote:I also think it can be done in one sweep, but I think this code is O(max(m,n)-min(m,n)): s1' and s2' each appear twice so will be shared.if m == n your timing reduces to O(0) !?On 23 March 2017 at 08:52, Andrew Butterfield wrote:It can be done in one sweep down both strings together - look for the first difference,and then test hypotheses regarding the 5 possible editsoneOff :: String -> String -> BooloneOff [] [] = FalseoneOff [] (_:rest) = null restoneOff (_:rest) [] = null restoneOff s1@(c1:s1') s2@(c2:s2') | c1 == c2  =  oneOff s1' s2'  -- keep looking | otherwise       =    s1' == s2   -- s2 is s1 with c1 deleted or s1 is s2 with c1 inserted        || s2' == s1   -- s1 is s2 with c2 deleted or s2 is s1 with c2 inserted       || s1' == s2'  -- c1 is s1 was replaced by c2 in s2, or v.v.I guess the above should be O(1) in space - it's just a list crawl with testsO(n) in the length of the shortest StringCheers,  AndrewOn 22 Mar 2017, at 18:48, David Turner <[hidden email]> wrote:I think the space must be O(m+n) since you are using strict Text values. But even if you weren't, you're using s1 and s2 in more than one place so they will be shared, leading to them both being fully in memory at once.The minimum possible space complexity could well be O(1) if using String or lazy Text.The time complexity looks like O(min(m,n)), because that's the cost of `zip`/`filter`, not counting the cost of loading the two Texts into memory in the first place (which will be O(m+n)). If you used String, it'd be O(m+n) since that's the cost of the two calls to `length`. I think the minimum possible time complexity will be O(min(m,n)), and that could include the cost of loading the strings too.On 22 Mar 2017 18:32, "Michael Litchard" <[hidden email]> wrote:Problem spec from CareerCup.Given two strings, return boolean True/False if they are only one edit apart. Edit can be insert/delete/update of only one character in the string. Eg.-Truexyz,xzxyz,xykxy,xyz-Falsexyz,xyzxyz,xzyx,xyz ``````module Strings.OneLetter where` import Prelude import qualified Data.Text as T` oneLetter :: T.Text -> T.Text -> Bool oneLetter s1 s2 | s1 == s2 = False | max_l > (min_l + 1) = False | max_l == min_l = diff_size == 1 | otherwise = diff_size == 0 where length_s1 = T.length s1 length_s2 = T.length s2 max_l = max length_s1 length_s2 min_l = min length_s1 length_s2 diff_size = length \$ filter (\(a,b) -> a /= b) zipped zipped = T.zip s1 s2```````So, I used `Text` instead of `String`, hoping I could take advantage of fusion. I have the following questions and my initial attempt to answer them.What is the time complexity of `oneLetter` 0(m+n) where m is the length of `s1` and n is the length of `s2`what is the space complexity of `oneLetter`? I'm thinking due to laziness it's O(1), only two `Char`s are in memory at any one time, or two `Int`s. But I'm hazy on why. If this is wrong, please articulate why. If I'm right, and my reasoning is wrong or incomplete, please say why.I don't think I can improve the time complexity. Am I right? Can the space complexity be improved?What if I changed from `Text` to `String`? I don't think the time complexity changes, but how does this change the space complexity? _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post _______________________________________________Haskell-Cafe mailing listTo (un)subscribe, modify options or view archives go to:http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post. Andrew ButterfieldSchool of Computer Science & StatisticsTrinity CollegeDublin 2, Ireland _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. Andrew ButterfieldSchool of Computer Science & StatisticsTrinity CollegeDublin 2, Ireland _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post.
Open this post in threaded view
|

## Re: One-Off detection. Question about Space/Time complexity

 Sorry, I missed the vital word *space*. It looks time-efficient, but doesn't look O(1) in space.On 23 Mar 2017 16:12, "Andrew Butterfield" <[hidden email]> wrote:Yes - they both might be scanned twice, but O(2n) = O(n) = O(3n), even.The scanniing by  the three uses of == only walk over *all* of the longer list if its length is at most that of the shorterSimply put, if  s1 is much longer than s2, then those == scans will end when they reach the end of the shorter s2Or am I missing something?On 23 Mar 2017, at 15:19, David Turner <[hidden email]> wrote:I also think it can be done in one sweep, but I think this code is O(max(m,n)-min(m,n)): s1' and s2' each appear twice so will be shared.if m == n your timing reduces to O(0) !?On 23 March 2017 at 08:52, Andrew Butterfield wrote:It can be done in one sweep down both strings together - look for the first difference,and then test hypotheses regarding the 5 possible editsoneOff :: String -> String -> BooloneOff [] [] = FalseoneOff [] (_:rest) = null restoneOff (_:rest) [] = null restoneOff s1@(c1:s1') s2@(c2:s2') | c1 == c2  =  oneOff s1' s2'  -- keep looking | otherwise       =    s1' == s2   -- s2 is s1 with c1 deleted or s1 is s2 with c1 inserted        || s2' == s1   -- s1 is s2 with c2 deleted or s2 is s1 with c2 inserted       || s1' == s2'  -- c1 is s1 was replaced by c2 in s2, or v.v.I guess the above should be O(1) in space - it's just a list crawl with testsO(n) in the length of the shortest StringCheers,  AndrewOn 22 Mar 2017, at 18:48, David Turner <[hidden email]> wrote:I think the space must be O(m+n) since you are using strict Text values. But even if you weren't, you're using s1 and s2 in more than one place so they will be shared, leading to them both being fully in memory at once.The minimum possible space complexity could well be O(1) if using String or lazy Text.The time complexity looks like O(min(m,n)), because that's the cost of `zip`/`filter`, not counting the cost of loading the two Texts into memory in the first place (which will be O(m+n)). If you used String, it'd be O(m+n) since that's the cost of the two calls to `length`. I think the minimum possible time complexity will be O(min(m,n)), and that could include the cost of loading the strings too.On 22 Mar 2017 18:32, "Michael Litchard" <[hidden email]> wrote:Problem spec from CareerCup.Given two strings, return boolean True/False if they are only one edit apart. Edit can be insert/delete/update of only one character in the string. Eg.-Truexyz,xzxyz,xykxy,xyz-Falsexyz,xyzxyz,xzyx,xyz ``````module Strings.OneLetter where` import Prelude import qualified Data.Text as T` oneLetter :: T.Text -> T.Text -> Bool oneLetter s1 s2 | s1 == s2 = False | max_l > (min_l + 1) = False | max_l == min_l = diff_size == 1 | otherwise = diff_size == 0 where length_s1 = T.length s1 length_s2 = T.length s2 max_l = max length_s1 length_s2 min_l = min length_s1 length_s2 diff_size = length \$ filter (\(a,b) -> a /= b) zipped zipped = T.zip s1 s2```````So, I used `Text` instead of `String`, hoping I could take advantage of fusion. I have the following questions and my initial attempt to answer them.What is the time complexity of `oneLetter` 0(m+n) where m is the length of `s1` and n is the length of `s2`what is the space complexity of `oneLetter`? I'm thinking due to laziness it's O(1), only two `Char`s are in memory at any one time, or two `Int`s. But I'm hazy on why. If this is wrong, please articulate why. If I'm right, and my reasoning is wrong or incomplete, please say why.I don't think I can improve the time complexity. Am I right? Can the space complexity be improved?What if I changed from `Text` to `String`? I don't think the time complexity changes, but how does this change the space complexity? _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post _______________________________________________Haskell-Cafe mailing listTo (un)subscribe, modify options or view archives go to:http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post. Andrew ButterfieldSchool of Computer Science & StatisticsTrinity CollegeDublin 2, Ireland _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. Andrew ButterfieldSchool of Computer Science & StatisticsTrinity CollegeDublin 2, Ireland _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post. _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafeOnly members subscribed via the mailman list are allowed to post.