Disclaimer: I'm not a GHCi expert, and I'm not that good with GHC Core either. Now that I've lost my credibility, let's try to understand what happens:
GHCi and CAFs
GHCi retains all evaluated CAFs:
Normally, any evaluation of top-level expressions (otherwise known as CAFs or Constant Applicative Forms) in loaded modules is retained between evaluations.
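To make the term concrete, here is a minimal sketch of a CAF next to a binding that is not one. The names `numbers` and `numbersFrom` are made up for illustration and do not come from the programs in question.

```haskell
module Main where

-- A CAF: a top-level binding that takes no arguments.
-- Once part of it has been evaluated, that part is shared by every use.
numbers :: [Int]
numbers = [1 ..]

-- Not a CAF: it takes an argument, and the list depends on that argument.
numbersFrom :: Int -> [Int]
numbersFrom n = [n ..]

main :: IO ()
main = do
  print (take 3 numbers)
  print (take 3 (numbersFrom 1))
```

In GHCi, the evaluated prefix of `numbers` would stay in memory between evaluations, which is exactly the retention described in the quote above.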
Now you might wonder why there's such a big difference between both versions. Let's have a look at the Core with -ddump-simpl. Note that you might want to drop -dsuppress-all when you dump the programs yourself.
Dumps of your programs
Non-exploding version:
$ ghc SO.hs -ddump-simpl -fforce-recomp -O0 -dsuppress-all
[1 of 1] Compiling Main ( SO.hs, SO.o )
==================== Tidy Core ====================
Result size of Tidy Core = {terms: 29, types: 28, coercions: 0}
$dShow_rq2
$dShow_rq2 = $fShow[] $fShowChar

Rec {
rList_reI
rList_reI =
  \ ds_dpU ->
    case ds_dpU of _ {
      [] -> return $fMonadIO ();
      : x_aho xs_ahp -> rList_reI xs_ahp
    }
end Rec }

main
main =
  >>
    $fMonadIO
    (print $dShow_rq2 (unpackCString# "test"))
    (rList_reI (enumFrom $fEnumInt (I# 1)))

main
main = runMainIO main
The important part is the location of [1..], almost at the end:
    enumFrom $fEnumInt (I# 1)
As you can see, the list isn't a CAF. But what happens if we instead use the exploding version?
Exploding version
$ ghc SO.hs -ddump-simpl -fforce-recomp -O0 -dsuppress-all
[1 of 1] Compiling Main ( SO.hs, SO.o )
==================== Tidy Core ====================
Result size of Tidy Core = {terms: 32, types: 31, coercions: 0}
$dShow_rq3
$dShow_rq3 = $fShow[] $fShowChar

Rec {
rList_reI
rList_reI =
  \ ds_dpV ->
    case ds_dpV of _ {
      [] -> return $fMonadIO ();
      : x_ahp xs_ahq -> rList_reI xs_ahq
    }
end Rec }

lst_rq4
lst_rq4 = enumFrom $fEnumInt (I# 1)

main
main =
  >>=
    $fMonadIO
    (print $dShow_rq3 (unpackCString# "test"))
    (\ _ -> rList_reI lst_rq4)

main
main = runMainIO main
There's suddenly a new top-level expression, namely lst_rq4, which generates the list. And as seen before, GHCi retains the evaluations of top-level expressions, so lst_rq4 will also be retained.
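As a rough illustration, the two Core dumps correspond to source shapes like the following. This is a reconstruction from the dumps, not the original programs: the names `limit`, `exploding` and `nonExploding` are hypothetical, and the list is bounded by `limit` only so that the sketch terminates (the dumps use [1..]).

```haskell
module Main where

-- Reconstructed from rList_reI in the dumps:
-- walks the list and ignores every element.
rList :: [Int] -> IO ()
rList []       = return ()
rList (_ : xs) = rList xs

-- Assumption: an upper bound so the sketch finishes.
limit :: Int
limit = 1000000

-- Exploding shape: the list is a top-level binding (a CAF),
-- like lst_rq4 in the second dump.
lst :: [Int]
lst = [1 .. limit]

exploding :: IO ()
exploding = print "test" >>= \_ -> rList lst

-- Non-exploding shape: the list is written inline as the argument,
-- like the enumFrom call inside main in the first dump.
nonExploding :: IO ()
nonExploding = print "test" >> rList [1 .. limit]

main :: IO ()
main = nonExploding >> exploding
```

In GHCi, `lst` keeps a reference to the head of the list, so the cells that `rList` has already walked past cannot be collected. How GHC compiles either shape still depends on the optimization level, so treat this as a picture of the -O0 dumps only.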
Now there is an option to discard the evaluations:
Turning on +r causes all evaluation of top-level expressions to be discarded after each evaluation (they are still retained during a single evaluation).
But since "they are still retained during a single evaluation", even :set +r won't help you in this case. Unfortunately I cannot answer why GHC introduces a new top-level expression.
Why does this even happen in optimized code?
The list is still a top-level expression:
main2
main2 = eftInt 1 2147483647
Funnily enough, GHC doesn't actually create an infinite list, since Int is bounded.
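The boundedness is easy to check with a small sketch. The name `lastThree` is made up for illustration; the upper limit 2147483647 in the dump is 2^31 - 1, which suggests an Int of 32 bits on the machine that produced it, whereas a 64-bit GHC would show a larger bound.

```haskell
module Main where

-- [n ..] at type Int stops at maxBound, so the list is always finite.
lastThree :: [Int]
lastThree = [maxBound - 2 ..]

main :: IO ()
main = do
  print (length lastThree)            -- 3
  print (last lastThree == maxBound)  -- True
  -- The bound seen in the dump is 2^31 - 1.
  print (2 ^ (31 :: Int) - 1 == (2147483647 :: Integer))  -- True
```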
How can one get rid of the leak?
In this case you can get rid of it if you place the list in test:
test = do
    x <- print "test"
    rList [1..]
This will prevent GHC from creating a top-level expression.
However, I can't really give general advice on this. Unfortunately, my Haskell-fu isn't yet good enough.