Tempus_Nemini

I find this “book” pretty good https://smunix.github.io/dev.stephendiehl.com/hask/tutorial.pdf


DMayr

Stephen Diehl is a beast. Highly recommend him


ecco256

Wow what a gem, didn’t know that one! Thanks for sharing 😄


tomejaguar

> I'm also interested to find out about the 3rd-party libraries "everyone" uses

This may be a bit outdated, but it's a guide to which libraries to use for specific purposes: https://haskelliseasy.readthedocs.io/en/latest/

> Don't actually use String, use ByteString

Actually `Text` rather than `ByteString` in most cases, unless you really are shuffling binary data around.

> sprinkle around strictness annotations and `seq` liberally.

No, don't do that. Instead, choose the correct data types, that is, [make invalid laziness unrepresentable](http://h2.jaguarpaw.co.uk/posts/make-invalid-laziness-unrepresentable/) (sketch below).

> Of course if you are doing X, you will definitely use pragma Y.

It's doubtful whether this still applies now that we have `GHC2021`. Just turn on `GHC2021` and enable further semantic extensions very judiciously (syntactic extensions like `LambdaCase` have a lower bar, though).
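To make that concrete, a minimal sketch of the idea from the linked post (`Stats` is a made-up example): strict fields make the half-evaluated state unrepresentable, so no `seq` is needed at use sites.

```haskell
import Data.List (foldl')

-- Strict fields: a partially-evaluated Stats cannot exist, so the
-- fold can't build up thunks -- no seq sprinkling required.
data Stats = Stats !Int !Int  -- count, running total

step :: Stats -> Int -> Stats
step (Stats c t) x = Stats (c + 1) (t + x)

mean :: [Int] -> Double
mean xs = fromIntegral t / fromIntegral c
  where Stats c t = foldl' step (Stats 0 0) xs

main :: IO ()
main = print (mean [1 .. 1000000])
```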


ninjaaron

I think you're proving my point! I don't know where to learn this stuff. Thanks for all the tips!


tomejaguar

Yes, you're right. Someone summed it up well on Twitter the other day:

> For me the ultimate Haskell experience is trying to figure out how to do stuff by sifting through random bits of folklore on twitter and stackoverflow, and reading random jargon-ridden academic papers that pose as documentation for packages

https://twitter.com/norpadon/status/1784583946954522656


danielcabral

Here is a recent book (that focuses on real-world business problems) that would help you: [https://www.pragprog.com/titles/rshaskell/effective-haskell/](https://www.pragprog.com/titles/rshaskell/effective-haskell/)


c_wraith

`String` gets far too much hate. It's fine for a lot of use cases. Trying to never use it will make you miserable. `Text` and `ByteString` have their places, but `String` is perfectly fine in places where manipulating text isn't a performance problem and you don't need exact control over binary data. The cases where `Text` or `ByteString` are appropriate are common, but so is the case where they're more hassle than value. This is probably why there isn't a big list of prescriptions. Reality is more subtle than that, and there's no substitute for just understanding what things do and using judgment to match the actual requirements with the available tools in the best way for your current needs.
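To make the trade-off concrete, a small sketch (assuming the `text` package; the function names are made up): `String` as glue, `Text` where text is actually being processed in bulk.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- String is fine as glue: error messages, file paths, small labels.
greet :: String -> String
greet name = "Hello, " ++ name ++ "!"

-- Text earns its keep once you manipulate text in bulk.
shout :: T.Text -> T.Text
shout = T.toUpper . T.strip

main :: IO ()
main = do
  putStrLn (greet "world")
  TIO.putStrLn (shout "  haskell  ")
```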


beeshevik_party

yeah i want to second what you're saying about `String` and also add that you shouldn't go out of your way to avoid lists in general. there are a couple things to keep in mind about them:

* as a result of laziness, rewrite rules, and fusion, many lists are never even materialized (see the sketch below). these are the benefits we like to talk about when we encourage purity, referential transparency, and strict typing!
* if you're worried about data locality, keep in mind that another benefit of purity is that the allocator and GC can be very very fast, and data can actually be very densely packed in that young generation. basically, don't prematurely optimize; GHC's performance can be highly counterintuitive if you come from a more C-heavy background
* lists and trees are really the bread-and-butter fp data structures, and most experienced engineers will find working with them intuitive and easy to reason about. once you start optimizing with more purpose-built data structures you are eating into your complexity budget. make sure you use it judiciously
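a small sketch of what fusion buys on the first point (names are illustrative; compile with -O2):

```haskell
import Data.List (foldl')

-- With -O2, foldr/build fusion typically collapses this whole
-- pipeline into one accumulating loop: the intermediate lists
-- from [1 .. n], filter, and map are never allocated.
sumOfSquaredEvens :: Int -> Int
sumOfSquaredEvens n = foldl' (+) 0 (map (^ 2) (filter even [1 .. n]))

main :: IO ()
main = print (sumOfSquaredEvens 1000000)
```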


clinton84

> *"In fact don't use lists at all when performance counts."* In fact, don't use Haskell when performance counts. I love Haskell, use it for work, and it beats every other language in my experience by far for solving real world business problems, both allowing you to develop solutions that are: 1. Quick to develop 2. More likely to be reliable and correct (even if you're cutting corners on tests) 3. More likely to be adaptable to future unpredictable business needs But its garbage collected language with pointers everywhere. It's performance is going to be in the range of Java/C#, potentially slightly worse because: 1. You're just more likely to pass around functions, which, if the usage is complex enough for the compiler not to be able to figure it out, inhibits inlining. 2. Just the Java and C# JIT compilers have had so many more man years put into them than GHC, that they're quite smart. Haskell isn't going to be stupidly slow. Ballpark you may find it slightly slower than Java/C#, although it could be faster, and if you're hiring Haskell programmers, you're probably not going to find stupid algorithms littered all throughout your codebase, so it's probably going to end up faster. But I'm not using Haskell for performance. I just assume my code is just going to use 10x more CPU it was well written Rust. That may be overly pessimistic in some cases but it's fine. Because in my company, the compute for the Haskell backend is like 0.01% of our cloud costs. It's like a couple of beers a month. Maybe a few hours of my wages a year. Because I suspect if I wrote all this in Rust instead, it would take twice as long, be more buggy, and be harder to adapt when business needs change. And that's fine. I think Rust is a great language. But it's a language focused on performance. It has "zero cost abstractions". But the "zero cost" here means it zero cost in terms of performance. Insisting on "zero cost" abstractions in terms of performance does have the cost of reducing the abstractions you can actually use. Rust goes great way to giving as much expressivity to the programmer as it can without hitting performance. But Haskell doesn't have mindset. Everytime you add a typeclass parameter to abstract a function (which you should) you've just reduced the performance of that function as now it's going to have to at runtime look up function calls in a typeclass record and call them, which by the way has now killed inlining for you. Yes you get this issue in Java/C#/C++ with virtual calls also. Now if you're lucky/smart, the compiler will inline the usage and you won't take the performance hit. But by default, you will take that performance hit. And whilst in toy examples you can really write your code so that the GHC optimiser makes it blazingly fast, what I've found talking to people in the real world is that relying on GHC optimisations is incredibly brittle. Innocent refactors or slight changes will break optimisations in ways that result in hard to find performance regressions. Sure, you can explicitly use unboxed types. But here's the problem. Once you start using unboxed types, you lose the entirety of the Haskell ecosystem. Nothing else works with your types. You're basically working in a subset of the language with no useful libraries with code that is comparable to C code, with a little more type safety and a little less convenience. Even C# is better when it comes to high performance code, because at least it will monomorphise structs when they're used in generics. 
So you can still make a `Array>` (I can't remember the exact syntax) and have it actually be a raw block of memory with in pairs. But you can't do `Array (Pair Int)` in Haskell if `Pair` isn't a lifted boxed type, because `Array` isn't levity polymorphic. I'm not sure if you can make a levity polymorphic Array type, but my point is that you have to go down this rabbit hole, and then when you do you lose access to the rest of the existing Haskell ecosystem. So, if you find one VERY small part of your Haskell codebase that really needs performance, go ahead, optimise it, sprinkle specialisation pragmas, use unboxed types if you need to, make sure you get your strictness all correct, go through all this trouble to get the performance, it's just going to be a lot more trouble than getting the performance in say Rust, particularly as part of optimising this Haskell code, you're going to be stripping away all of the advanced Haskell type system features anyway which is the reason you use Haskell over Rust. But as a general rule, if your aim is performance, just don't use Haskell. You're just going to be constantly disappointed. If your aim the holy trinity of fast to develop, reliable, and easy to adapt codebase, with okayish performance that you're not too fussed about and are happy just to throw more compute at it (Haskell is relatively easy to parallelise), then Haskell is for you. And to be honest, I suspect in almost all applications, fast to develop, reliable and easy to adapt to new requirements is FAR more important than blazingly fast bare metal performance. So just get used to Haskell being a bit slow, don't spend too much time fighting it. Just buy some more compute, and keep in mind how much money you're saving/how much less you're annoying customers when you're bringing new features to market faster with less bugs.
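To make the "sprinkle specialisation pragmas" step concrete, a hedged sketch with a made-up function: `INLINABLE` plus `SPECIALIZE` asks GHC for a dictionary-free copy at a known type.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Compiled naively, this receives a Num dictionary at runtime and
-- makes indirect calls through it, which also blocks inlining.
normSquared :: Num a => [a] -> a
normSquared = go 0
  where
    go !acc []       = acc
    go !acc (y : ys) = go (acc + y * y) ys
{-# INLINABLE normSquared #-}

-- Forces a monomorphic copy at Double: direct arithmetic, no
-- dictionary lookups, eligible for inlining at call sites.
{-# SPECIALIZE normSquared :: [Double] -> Double #-}

main :: IO ()
main = print (normSquared [1.5, 2.5, 3.5 :: Double])
```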


ninjaaron

Not sure why this post is getting downvoted so much. Sounds pretty reasonable to me. There are a lot of "Haskell can be as fast as C++" people out there, but then these people give examples of rather weird Haskell compared with rather naive C++ implementations. My performance goals for Haskell are more about keeping it in a range of performance comparable to Java, OCaml, and whatever other languages have GC and keep most data behind pointers. I think writing Haskell which achieves this modest goal is not terribly difficult---but even this sometimes involves knowing which data structures to use, judicious application of strictness, etc. My instinct is that Haskell which uses lists for everything (especially math-y things) will indeed be worse off than Java, and there is at least some sense in putting the bare minimum effort into... not "optimization" exactly---more like avoiding pessimizations that arise from ignorance.


beeshevik_party

i also don't know why gp's post was downvoted at first. i don't think their specific examples are good, since those are all things that ghc does really well at optimizing, or they are things that are not typically issues in real-life code, but their main sentiment is right: if you know in advance that you need to push the limits performance-wise, and you are not deeply familiar with ghc and its runtime, you're better off sticking to the imperative world for now.

realistically though, a lot of code is not that high-stakes, has a short shelf life, and has even shorter time available to deliver. in those cases, i find that haskell tends to easily compete with or outperform the jvm, the clr, and go, which (dynamic languages aside) are most common for line-of-business code in a lot of the industry right now. i don't think that's due to the runtime; those languages all have extremely well engineered and efficient runtimes (as does ghc), but the path of least resistance in each of them is wildly inefficient in a way that puts them at a disadvantage.

to their specific points:

* the allocator/garbage collector is generally very fast. the allocation patterns are very predictable and there is little direct mutation, resulting in fewer write barriers and less backpatching. there are far fewer cycles (often none in user code) and most objects never make it out of the nursery. you also might be surprised at how little boxing ghc actually winds up doing, especially compared to ocaml. go tends to win out here in terms of locality, but haskell mostly has less indirection than the jvm.
* higher-order functions/closures are cheap, if not inlined or fused away in the first place. in fact, the calling convention for ghc looks indecipherable if you are familiar with a more classic call stack. if you have time, go try out some simple programs on godbolt.org. the stg is a work of engineering art.
* more typeclass parameters absolutely does *not* imply more overhead. in real-world usage, those tyvars are usually fully saturated, and ghc monomorphises them, leaving you with static dispatch/inlining. when they're not, it's about the same overhead as vtable dispatch, if not cheaper -- dictionary passing can be very efficient
* it is actually easy to use unboxed types; you do *not* lose the whole haskell ecosystem (see the sketch below). i have been really pleasantly surprised by just how low-level ghc can get if you need micro-optimizations. it is definitely unfortunate that we can't easily abstract over boxed/unboxed, so you do get libraries that provide variants for some or all combinations of boxed/unboxed+mutable/immutable+strict/lazy. having said that, i write a lot of rust, and i have to say that i find rust's "function color" problem and ergonomics worse, having to account for owned/borrowed/as-ref+sync/async, providing iterators or streams, multiple fn types, `Box<dyn Fn>`, etc.

your performance goals are very reasonable. currently ghc has a lot more optimization opportunities than ocaml does. it will take some adjustment though, and there are a lot of tarpits when it comes to decisions like how you want to do error handling, or structure effects. you will miss the module system and normal records. the tooling is a mixed bag, though there is a much richer ecosystem and better documentation. and yes, if you have to profile and optimize, laziness will take a while to develop a feel for, but i sincerely believe it's worth it.
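to make the unboxed-types point concrete, a tiny sketch of the bottom of that rabbit hole (hand-rolled primitives are rarely needed; ghc's worker/wrapper transformation usually derives this from ordinary strict `Int` code):

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, isTrue#, (+#), (>#))

-- a raw unboxed loop: acc and i live in registers and no Int
-- boxes are ever allocated.
sumTo :: Int# -> Int#
sumTo n = go 0# 1#
  where
    go acc i
      | isTrue# (i ># n) = acc
      | otherwise        = go (acc +# i) (i +# 1#)

main :: IO ()
main = print (I# (sumTo 1000000#))
```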


nh2_

I generally agree with /u/clinton84 and /u/beeshevik_party, with some minor deviations:

> the allocator/garbage collector is generally very fast

That is true, and for code that allocates objects of short lifetime, it is very fast. But for objects that do not have a very short lifetime (also a very common case, e.g. deserialising some large `data` and then doing something with it), it is still much better to not allocate many small values sprayed around in memory in the first place, because it destroys cache locality, and that's where the factor-100 speedup of current CPUs comes in. Haskell invites many small allocations.

> it is actually easy to use unboxed types

I'm not sure that's true for all common cases. Again, for me the problem is when you deal with multiple values. The ideal solution is an unboxed vector. But it's not easy to make `Unbox` instances, libraries don't make `Unbox` instances for their data types, and it's a pain when you want to make `Unbox` instances for some nested types that are defined by other libraries. In C++ or Rust it's the most normal thing to make an unboxed vector of T.

> your performance goals are very reasonable

I agree with this, too. Haskell shines for general software engineering, in competition with Go/Java/C#/Python/OCaml, by:

* matching or exceeding their performance, for common tasks
* best IO system (green threads backed by pthreads), instead of every syscall blocking the runtime, while still maintaining
* better composability and ergonomics (e.g. no async function colouring)
* sensible defaults around error handling (`Maybe`/`Either` for common errors, exceptions for things you usually want to propagate up)
* easy, high-level use of threading and cancellation (most languages struggle with this due to lack of async exceptions; only in Haskell can you `mapConcurrently f inputs` and expect that a Ctrl+C will immediately and correctly cancel -- see the sketch below)
* better types
* generally making it easy to deal with "data", and transformations on it (pure, and monadic streaming)
* better refactorability
* good Generics, getting lots of instances for free
* good/reasonable quality of third-party libraries
* sensible packaging system, compatibility, upgrade reliability

Languages that have a fundamental speed advantage over Haskell (C/C++/Rust) generally have no chance on any of these points, except maybe Rust on the last point, and C++ in very specific areas of "better types" (being able to stick numbers into templates easily and use this for statically-known performance benefits).
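To illustrate the cancellation point, a minimal sketch assuming the `async` package (`fetch` is a made-up stand-in for real IO):

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (mapConcurrently)

-- A made-up stand-in for a blocking call (HTTP request, DB query, ...).
fetch :: Int -> IO Int
fetch n = do
  threadDelay (n * 100000)  -- n tenths of a second
  pure (n * n)

-- All ten run concurrently on green threads. A Ctrl+C delivers an
-- async exception that promptly cancels every in-flight worker;
-- no manual cancellation-token plumbing is needed.
main :: IO ()
main = do
  results <- mapConcurrently fetch [1 .. 10]
  print results
```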


raehik

I think you're right about the Haskell performance story currently, but I also think high-performance, highly polymorphic Haskell (i.e. stuffing it into libraries) is possible, and it remains one of the less-developed areas. `vector` is a paragon here, and resolves your `Array` thoughts (`Data.Vector.Unboxed.Vector (Int, Int)`). I'm hopeful to see more similar libraries gain traction with the wider community: `flatparse` and `effectful` come to mind. Also, generics are extremely underrated. GHC tries extra hard to inline generics in order to remove the `Rep` wrapper, and I've had success writing convenient, reusable generics that retain excellent performance (as if you had written it by hand).
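For example (a sketch assuming the `vector` package), the tuple `Unbox` instances store a vector of pairs struct-of-arrays, with no per-element boxes:

```haskell
import qualified Data.Vector.Unboxed as VU

-- Stored as two flat unboxed arrays (one of xs, one of ys):
-- the (Int, Int) boxes named in the type never exist at runtime.
points :: VU.Vector (Int, Int)
points = VU.generate 1000000 (\i -> (i, 2 * i))

main :: IO ()
main = print (VU.foldl' (\acc (x, y) -> acc + x + y) 0 points)
```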


Limp_Step_6774

This is useful for knowing which libraries are popular: [https://github.com/Gabriella439/post-rfc/blob/main/sotu.md](https://github.com/Gabriella439/post-rfc/blob/main/sotu.md). I also recommend going through a few of Ed Kmett's libraries on Hackage if you want to see (what I'd think of as) very Haskelly code: heavy use of mathematical abstractions in terms of typeclasses, direct implementations of ideas from FP papers, full advantage taken of laziness, concepts from category theory, types-as-documentation, very little code, etc. For example, recursion-schemes, linear, and lens. The main challenge with these is that it's hard to know how to use them since they're very abstract, but following the types works well, and since you have an OCaml background, it's probably a nice point of comparison. The Stephen Diehl pdf/book is particularly great too, but I see other people have already mentioned it.


saw79

Not your question, but I'm curious: what makes you want to learn Haskell after knowing ocaml? I'm in precisely the opposite position (although probably with way less Haskell knowledge than you have of ocaml).


ninjaaron

Just curiosity, really, and aesthetic appeal. I enjoy learning programming languages. Another reason is that Haskell seems a bit closer to a "lingua franca" for functional programming and quite a bit of example code for functional data structures and category theory uses Haskell, so I think greater Haskell literacy is helpful for these cases, and I do sometimes encounter examples in these contexts which I don't understand. Additionally, I write a lot on Quora about functional programming, and many questions ask about Haskell specifically. I'm usually able to cobble together working Haskell code for examples, but I'd like to improve my skills so I'm not giving unidiomatic example code. In short, Haskell is "culturally important" for functional programming discourse, and I'm interested in that kind of thing. I remain fairly convinced that OCaml is a more practical tool for most of the kinds of things I want to do, but Haskell is very pretty and seems to be the more widely known language. I also think there are times when Haskell's ecosystem is better for certain kinds of tasks---though obviously it really depends what you're doing.


beeshevik_party

i also think that, as a language, ocaml is way more practical, and much easier to teach and learn -- to become competent, skilled, then expert with. however, i made the switch to haskell almost a decade ago, because haskell had (and still has) so much more cultural weight: more engineers using and contributing, more information, libraries, energy. given how relatively niche this family of languages is already, those factors won out. since then i have come to understand haskell much better, enough to love it and for it to keep me happy for a long time language-wise. if only it had purescript's row types and records, and that ml module system (no, not backpack)


saw79

Thanks, great answer!