

Hello,
I am solving a problem of finding the largest prime factor of
600851475143 (Project Euler is great for learning new language!), and
came with the following solution (it uses the most ineffective
approach to finding prime numbers, however is able to factor the above
number in fraction of second):
factors :: Integer > [Integer]
factors n = doFactors n (eratosthenesFilter [1..n])
doFactors n primes
 null newPrimes = []
 otherwise =
let topPrime = head newPrimes in
if topPrime == n
then [topPrime]
else topPrime : (doFactors (n `quot` topPrime) primes)
where
newPrimes = snd $ break (\x > (n `rem` x) == 0) primes
eratosthenesFilter :: [Integer] > [Integer]
eratosthenesFilter [] = []
eratosthenesFilter (num : nums)
 num == 1 = eratosthenesFilter nums
 otherwise = num : (eratosthenesFilter $ doFilter num nums)
where
doFilter num nums = filter (\x > x > num && (x `rem` num) /= 0) nums
What would you do different (including stylistic changes)? What are
the general comments about efficiency (not of the algorithm, but of
the implementation: for example, is it fine to use break at every
invocation of doFactors?) and elegance of the solution?
Thanks and regards,
Sergey


On
http://www.haskell.org/haskellwiki/Prime_numbers"primeFactors" should do what you want (although I don't like the
the pattern matching on "1")
Cheers Christian
Sergey V. Mikhanov wrote:
> Hello,
>
> I am solving a problem of finding the largest prime factor of
> 600851475143 (Project Euler is great for learning new language!), and
> came with the following solution (it uses the most ineffective
> approach to finding prime numbers, however is able to factor the above
> number in fraction of second):
>
> factors :: Integer > [Integer]
>
> factors n = doFactors n (eratosthenesFilter [1..n])
>
> doFactors n primes
>  null newPrimes = []
>  otherwise =
> let topPrime = head newPrimes in
> if topPrime == n
> then [topPrime]
> else topPrime : (doFactors (n `quot` topPrime) primes)
> where
> newPrimes = snd $ break (\x > (n `rem` x) == 0) primes
>
> eratosthenesFilter :: [Integer] > [Integer]
>
> eratosthenesFilter [] = []
> eratosthenesFilter (num : nums)
>  num == 1 = eratosthenesFilter nums
>  otherwise = num : (eratosthenesFilter $ doFilter num nums)
> where
> doFilter num nums = filter (\x > x > num && (x `rem` num) /= 0) nums
>
> What would you do different (including stylistic changes)? What are
> the general comments about efficiency (not of the algorithm, but of
> the implementation: for example, is it fine to use break at every
> invocation of doFactors?) and elegance of the solution?
>
> Thanks and regards,
> Sergey


Sergey V. Mikhanov wrote:
>
> factors :: Integer > [Integer]
> factors n = doFactors n (eratosthenesFilter [1..n])
>
> doFactors n primes
>  null newPrimes = []
>  otherwise =
> let topPrime = head newPrimes in
> if topPrime == n
> then [topPrime]
> else topPrime : (doFactors (n `quot` topPrime) primes)
> where
> newPrimes = snd $ break (\x > (n `rem` x) == 0) primes
>
> eratosthenesFilter :: [Integer] > [Integer]
> eratosthenesFilter [] = []
> eratosthenesFilter (num : nums)
>  num == 1 = eratosthenesFilter nums
>  otherwise = num : (eratosthenesFilter $ doFilter num nums)
> where
> doFilter num nums = filter (\x > x > num && (x `rem` num) /= 0) nums
>
> What would you do different (including stylistic changes)? What are
> the general comments about efficiency (not of the algorithm, but of
> the implementation: for example, is it fine to use break at every
> invocation of doFactors?) and elegance of the solution?
Stylistically, one usually uses shorter variable names in Haskell. Also,
the guards in doFactors are better expressed as pattern matching and
the if can be turned into guards.
factors :: Integer > [Integer]
factors n = go n $ eratosthenes [2..n]
where
go n [] = []
go n (p:ps)
 n `mod` p == 0 = p : go (n `div` p) ps
 otherwise = go n ps
eratosthenes :: [Integer] > [Integer]
eratosthenes [] = []
eratosthenes (p:ps) = p : erathostenes ps'
where
ps' = filter (\x > x > p && (x `mod` p) /= 0) ps
Other than that, efficiency is best understood as algorithmic
efficiency; there are not straightforward "tweaks" that give you the
warm fuzzy feeling of "writing efficient code".
Regards,
apfelmus

http://apfelmus.nfshost.com


2009/3/10 Heinrich Apfelmus < [hidden email]>
> Sergey V. Mikhanov wrote:
> >
>
... some code ...
>
> >
> > What would you do different (including stylistic changes)? What are
> > the general comments about efficiency (not of the algorithm, but of
> > the implementation: for example, is it fine to use break at every
> > invocation of doFactors?) and elegance of the solution?
>
> Stylistically, one usually uses shorter variable names in Haskell.
<beginner rant>
Sometime too short peraphs? At least, this is one of the things that slows
down my understanding of code posted
on this list on or on various haskell tutorial. In any other language I
know, programmers learn to give meaningful names to variable and functions,
so when one reads a program, one can use the name to remember what the
function does. Then one cames to haskell ...
I guess the short names comes from mathematic background, but still ...
haskell is already very succint  even more so
when you use pointfree programming  and if one also uses names like a,e,i (
look at Array function definitions ), ...
</beginner rant>
Rant apart, I notice that in my own excercises I tend to shorten names, so
maybe there is a reason for that.
Nevertheless readability tends to be a big issue in languages used in IT
industry, and my feeling is that haskell
tends to err on the laconic side of the balance.
Out of curiosity, there is any reason why you called the auxiliary function
'go' ?
Ciao

FB
 next part 
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20090311/a2dae0f4/attachment0001.htm


>>>>> "Francesco" == Francesco Bochicchio < [hidden email]> writes:
Francesco> Nevertheless readability tends to be a big issue in
Francesco> languages used in IT industry, and my feeling is that
Francesco> haskell tends to err on the laconic side of the
Francesco> balance.
Strongly seconded.

Colin Adams
Preston Lancashire


On 3/10/2009, "Sergey V. Mikhanov" < [hidden email]> wrote:
> Hello,
>
>I am solving a problem of finding the largest prime factor of
>600851475143 (Project Euler is great for learning new language!), and
>came with the following solution (it uses the most ineffective
>approach to finding prime numbers, however is able to factor the above
>number in fraction of second):
>
>factors :: Integer > [Integer]
>
>factors n = doFactors n (eratosthenesFilter [1..n])
>
>doFactors n primes
>  null newPrimes = []
>  otherwise =
> let topPrime = head newPrimes in
> if topPrime == n
> then [topPrime]
> else topPrime : (doFactors (n `quot` topPrime) primes)
> where
> newPrimes = snd $ break (\x > (n `rem` x) == 0) primes
>
>eratosthenesFilter :: [Integer] > [Integer]
>
>eratosthenesFilter [] = []
>eratosthenesFilter (num : nums)
>  num == 1 = eratosthenesFilter nums
>  otherwise = num : (eratosthenesFilter $ doFilter num nums)
> where
> doFilter num nums = filter (\x > x > num && (x `rem` num) /= 0) nums
>
>What would you do different (including stylistic changes)? What are
>the general comments about efficiency (not of the algorithm, but of
>the implementation: for example, is it fine to use break at every
>invocation of doFactors?) and elegance of the solution?
>
>Thanks and regards,
>Sergey
This is my solution to the same problem. I'm just a beginner with
Haskell as well, so just consider this as an alternate solution, not an
ideal solution.
The bottom 2 functions were pulled out of support code that I use for all
my Project Euler solutions, so that's why they seem unnecessarily
generic.
I think the only real advantages of my solution over yours is that I take
advantage of the fact that primes are always odd (except for the number
2) and that the largest prime factor of a number will always be <= half
its value.
main = putStrLn output
output = show result
result = largestPrimeFactor 600851475143
largestPrimeFactor n = last $ primeFactors n
{
 Gets the prime factors of an integer.
}
primeFactors :: (Integral a) => a > [a]
primeFactors n = primeFactorsUsingPrimesList (2:[3, 5 .. n `div` 2]) n
{
 Gets the prime factors of a number. The primes list passed as the
first
 argument is not required to be a list of primes. It is simply
required to be
 a list of values to try to divide the input from. If this list
contains
 nonprime values, they should be ordered. If the list does not
contain all
 of the primes that are divisors of the input value, then the result
will be
 incorrect.
}
primeFactorsUsingPrimesList :: (Integral a) => [a] > a > [a]
primeFactorsUsingPrimesList _ 1 = []
primeFactorsUsingPrimesList (x:xs) n = if n `rem` x == 0
then x : primeFactorsUsingPrimesList (x:xs) (n `div` x)
else primeFactorsUsingPrimesList xs n
primeFactorsUsingPrimesList [] n = [n]


In reply to this post by Francesco Bochicchio
Francesco Bochicchio wrote:
> Heinrich Apfelmus wrote:
>>
>> Stylistically, one usually uses shorter variable names in Haskell.
>
> <beginner rant>
> Sometime too short peraphs? At least, this is one of the things that slows
> down my understanding of code posted
> on this list on or on various haskell tutorial. In any other language I
> know, programmers learn to give meaningful names to variable and functions,
> so when one reads a program, one can use the name to remember what the
> function does. Then one cames to haskell ...
> I guess the short names comes from mathematic background, but still ...
> haskell is already very succint  even more so
> when you use pointfree programming  and if one also uses names like a,e,i (
> look at Array function definitions ), ...
> </beginner rant>
>
> Rant apart, I notice that in my own excercises I tend to shorten names, so
> maybe there is a reason for that.
> Nevertheless readability tends to be a big issue in languages used in IT
> industry, and my feeling is that haskell
> tends to err on the laconic side of the balance.
The goal is of course to make code readable, that's why I recommend
short names. :D
Abstraction is the one driving force for very short names. For example,
take the definition of foldr
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
Since this function is polymorphic, so f , z and the xs can be
anything, using more "descriptive" variable names is simply not
possible; the key point of fold is its generality.
A second, and my main reason for short names, or rather against long
names, is that names should be to the point. None of the names
newPrimes
topPrime
doFactors
doFilter
accurately describe the object they represent. The primes are not "new",
the prime is not "on top". The "do" is a prefix does not carry a meaning
either, it just conveys that doFactors has something to do with
factors . This is best expressed by making doFactors a local
definition in the whereblock of factors .
The name eratosthenesFilter is ok, but since there is no other
eratosthenes around, no meaning is lost by shortening it to simply
eratosthenes . Not to mention that the conventional term is "sieve", not
"filter". The documentation has to elaborate on it anyway.
The generality of the name num hints that a single letter name is
preferable.
The names that I think are great because they are to the point are
factors
primes
> Out of curiosity, there is any reason why you called the auxiliary function
> 'go' ?
Convention. Often, an auxiliary function that does basically the same
thing as the main function factors but with an extra parameter will be
named factors' . The apostrophe has the drawback that it's easy to
forget, so some people now name such auxiliary functions go instead.
Regards,
apfelmus

http://apfelmus.nfshost.com


Greetings!
> A second, and my main reason for short names, or rather against long
> names, is that names should be to the point. None of the names
>
> newPrimes
> topPrime
> doFactors
> doFilter
>
> accurately describe the object they represent. The primes are not "new",
> the prime is not "on top". The "do" is a prefix does not carry a meaning
> either, it just conveys that doFactors has something to do with
> factors . This is best expressed by making doFactors a local
> definition in the whereblock of factors.
>
Those remarks are fine with me! I asked about the stylistic changes because
I came from the, hm, Java world and would like to avoid "writing familiar
things in unfamiliar language". In Java, factors() and doFactors() would be
a perfectly named methods: factors() is public, auxiliary doFactors() is
private and essentially _does_ the factoring.
> > Out of curiosity, there is any reason why you called the auxiliary
> function
> > 'go' ?
>
> Convention. Often, an auxiliary function that does basically the same
> thing as the main function factors but with an extra parameter will be
> named factors' . The apostrophe has the drawback that it's easy to
> forget, so some people now name such auxiliary functions go instead.
>
I think having a local function 'go' in 'factors' is aboslutely plausible:
it is local, there's no ambiguity.
Regards,
Sergey
 next part 
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20090313/74eea645/attachment.htm


>>>>> "Heinrich" == Heinrich Apfelmus < [hidden email]> writes:
Heinrich> Abstraction is the one driving force for very short
Heinrich> names. For example, take the definition of foldr
Heinrich> foldr f z [] = z foldr f z (x:xs) = f x (foldr f z
Heinrich> xs)
Heinrich> Since this function is polymorphic, so f , z and the xs
Heinrich> can be anything, using more "descriptive" variable names
Heinrich> is simply not possible; the key point of fold is its
Heinrich> generality.
Wouldn't unit be a better descriptive name than z?

Colin Adams
Preston Lancashire


Colin Paul Adams wrote:
>>>>>> "Heinrich" == Heinrich Apfelmus < [hidden email]> writes:
>
> Heinrich> Abstraction is the one driving force for very short
> Heinrich> names. For example, take the definition of foldr
>
> Heinrich> foldr f z [] = z foldr f z (x:xs) = f x (foldr f z
> Heinrich> xs)
>
> Heinrich> Since this function is polymorphic, so f , z and the xs
> Heinrich> can be anything, using more "descriptive" variable names
> Heinrich> is simply not possible; the key point of fold is its
> Heinrich> generality.
>
> Wouldn't unit be a better descriptive name than z?
I have never heard of a unit in relation to fold , I'm afraid. Monoids
and groups have units, as do physicists and a few other mathematical
structures.
While z is indeed quite often the unit of a monoid, for instance in
sum = foldr (+) 0
product = foldr (*) 1
concat = foldr (++) []
it doesn't have to be the unit of a monoid.
Regards,
apfelmus

http://apfelmus.nfshost.com


2009/3/13 Heinrich Apfelmus < [hidden email]>
> Francesco Bochicchio wrote:
> > Heinrich Apfelmus wrote:
> >>
> >> Stylistically, one usually uses shorter variable names in Haskell.
> >
> > <beginner rant>
> ...
> > </beginner rant>
> >
> > Rant apart, I notice that in my own excercises I tend to shorten names,
> so
> > maybe there is a reason for that.
> > Nevertheless readability tends to be a big issue in languages used in IT
> > industry, and my feeling is that haskell
> > tends to err on the laconic side of the balance.
>
> The goal is of course to make code readable, that's why I recommend
> short names. :D
>
> Abstraction is the one driving force for very short names. For example,
> take the definition of foldr
>
> foldr f z [] = z
> foldr f z (x:xs) = f x (foldr f z xs)
>
> Since this function is polymorphic, so f , z and the xs can be
> anything, using more "descriptive" variable names is simply not
> possible; the key point of fold is its generality.
Ok but one could still hint at their structure or purpose:
foldr function value (x:xs) = function x ( foldr function value xs )
I believe this would give a little more information to the casual reader.
>
>
> A second, and my main reason for short names, or rather against long
> names, is that names should be to the point. None of the names
>
> newPrimes
> topPrime
> doFactors
> doFilter
>
> accurately describe the object they represent. The primes are not "new",
> the prime is not "on top". The "do" is a prefix does not carry a meaning
> either, it just conveys that doFactors has something to do with
> factors . This is best expressed by making doFactors a local
> definition in the whereblock of factors .
I agree that welldocumented shared name conventions are better than
rollyourown.
(x:xs) is one example of such convention, although I tend to adopt slight
variations
like (n:nums) for list of numbers and (ch:chars) for list of characters.
But rollyourown is still better than cryptic.
>
>
> The name eratosthenesFilter is ok, but since there is no other
> eratosthenes around, no meaning is lost by shortening it to simply
> eratosthenes . Not to mention that the conventional term is "sieve", not
> "filter". The documentation has to elaborate on it anyway.
>
> The generality of the name num hints that a single letter name is
> preferable.
>
> The names that I think are great because they are to the point are
>
> factors
> primes
>
> I have some resistance to use nouns for functions. In the imperative world,
nouns are for
variables, verbs are for functions. I know that in pure functional
programming there is not such a thing
as variables, but still I would reserve nouns for function parameters and
bound expressions. Hence if I have a function that
find factors, I would call it findFactors rather than just factors.
One such example of misnaming  from a beginner point of view  is the
length function in prelude: if it was called
count, I believe more beginners would have realized that works by actually
counting the elements of
a list and not by accessing to some already available 'property' of the
list.
>
>
> > Out of curiosity, there is any reason why you called the auxiliary
> function
> > 'go' ?
>
> Convention. Often, an auxiliary function that does basically the same
> thing as the main function factors but with an extra parameter will be
> named factors' . The apostrophe has the drawback that it's easy to
> forget, so some people now name such auxiliary functions go instead.
>
>
I tend to use _ instead of '. Is more visible and keep conveying the idea
that the auxiliary function
is just a slight variation of the main one.
>
>
> Regards,
> apfelmus
>
Ciao

FB
 next part 
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/beginners/attachments/20090314/c0692edd/attachment0001.htm


"Sergey V. Mikhanov" < [hidden email]> wrote:
> I am solving a problem of finding the largest prime factor of
> 600851475143 (Project Euler is great for learning new language!),
> [...]
You may find my little auxilliary library [1] for ProjectEuler.net
problems handy. For the specific problem of factoring an integer, it
uses the wheel factoring method, which uses the idea of Eratosthenes
sieving to filter out most nonprimes with almost no performance loss
compared to the more naive trial division method.
[1] http://hpaste.org/fastcgi/hpaste.fcgi/view?id=2388Greets,
Ertugrul.

nightmare = unsafePerformIO (getWrongWife >>= sex)
http://blog.ertes.de/


In reply to this post by Francesco Bochicchio
Francesco Bochicchio wrote:
> Heinrich Apfelmus wrote:
>>
>> Abstraction is the one driving force for very short names. For example,
>> take the definition of foldr
>>
>> foldr f z [] = z
>> foldr f z (x:xs) = f x (foldr f z xs)
>>
>> Since this function is polymorphic, so f , z and the xs can be
>> anything, using more "descriptive" variable names is simply not
>> possible; the key point of fold is its generality.
>
> Ok but one could still hint at their structure or purpose:
>
> foldr function value (x:xs) = function x ( foldr function value xs )
>
> I believe this would give a little more information to the casual reader.
Sure, though there is already the convention from mathematics that
functions are named f, g and values are denoted by x, y or z .
> I have some resistance to use nouns for functions. In the imperative world,
> nouns are for variables, verbs are for functions.
> I know that in pure functional programming there is not such a thing
> as variables, but still I would reserve nouns for function parameters and
> bound expressions. Hence if I have a function that
> find factors, I would call it findFactors rather than just factors.
>
> One such example of misnaming  from a beginner point of view  is the
> length function in prelude: if it was called
> count, I believe more beginners would have realized that works by actually
> counting the elements of
> a list and not by accessing to some already available 'property' of the
> list.
IMHO, I think that using nouns like length or factors is a good
idea, for they sounds nice when used in compound expressions. Compare
for instance
take (length xs `div` 2) xs
take (count xs `div` 2) xs
The first can be put into pseudoenglish as
"take (the length of xs divided by 2) elements of xs"
while the verb in the second makes it difficult to use the expression in
parenthesis as object to another verb. In contrast, the runtime (O(n)
versus O(1) for say a record field) is rather secondary.
Implicitly adding the preposition "of" makes the nouns flow, i.e.
"factors of n", "length of xs".
>> Convention. Often, an auxiliary function that does basically the same
>> thing as the main function factors but with an extra parameter will be
>> named factors' . The apostrophe has the drawback that it's easy to
>> forget, so some people now name such auxiliary functions go instead.
>>
>>
> I tend to use _ instead of '. Is more visible and keep conveying the idea
> that the auxiliary function is just a slight variation of the main one.
The convention of using an apostrophe for variations comes from
mathematics and it seems that it was put into the Haskell syntax
specifically for this purpose.
In any case, the rationales and style that I presented are how I
interpret the naming conventions from classic papers like
John Hughes. Why functional programming matters.
http://www.cs.chalmers.se/~rjmh/Papers/whyfp.pdf John Hughes. The Design of a PrettyPrinting Library
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.8777Most of these conventions root in mathematics, of course. I like them
for they encourage thinking purely functionally, but these are of course
not the only possibilities.
Regards,
apfelmus

http://apfelmus.nfshost.com

