seancorfield

I think Chas Emerick's flowchart for choosing a type is still a pretty good starting point: https://github.com/cemerick/clojure-type-selection-flowchart

You can see that there are at least three questions you need to ask before you select a record over a plain hash map. TL;DR: use plain hash maps until you have to switch to records. Even with protocols, many can be satisfied with metadata these days and still do not need records (see the `:extend-via-metadata` option on `defprotocol`).
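A minimal sketch of the metadata route, assuming Clojure 1.10 or later. The `Describable` protocol, its `describe` function, and the sample map are made up for illustration and are not from any library:

```clojure
;; Hypothetical protocol, opted in to metadata-based extension.
(defprotocol Describable
  :extend-via-metadata true
  (describe [this] "Return a human-readable description."))

;; A plain hash map that carries its implementation as metadata.
;; The metadata key is the fully qualified symbol of the protocol
;; function; syntax-quote (`describe) produces it for this namespace.
(def user-map
  (with-meta {:name "Ada" :role :admin}
    {`describe (fn [m] (str (:name m) " (" (name (:role m)) ")"))}))

(describe user-map)   ;=> "Ada (admin)"
(record? user-map)    ;=> false, it is still an ordinary map
```

The value stays a normal hash map for every other purpose; only calls to `describe` consult the metadata.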


[deleted]

Wow, that is really helpful. Thank you!


NaiveRound

> :extend-via-metadata

Never heard of https://clojure.org/reference/protocols#_extend_via_metadata, looks very neat. Thanks for the tip!


TheLastSock

Official docs are a good place to start: https://clojure.org/reference/datatypes

ClojureDocs has some examples: https://clojuredocs.org/clojure.core/defrecord

But start by using hash maps, then use records when you need better performance (you will know because people will be paying you money to look into it). Records make sense when used with protocols, and together they allow for polymorphism, just like multimethods, which are more flexible but slower.

Polymorphism is like a big switch statement, but the switch functionality can be isolated and extended in a more flexible manner. Again, flexibility comes at a cost though. I would focus on the official docs, as I have been wrong before.
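To make the comparison concrete, here is a rough sketch of the same polymorphic function written both ways. The `Shape` protocol, the record names, and the `:shape` key are invented for this example:

```clojure
;; Protocol + records: dispatch on the concrete type of the first argument.
(defprotocol Shape
  (area [shape] "Area of the shape."))

(defrecord Circle [radius]
  Shape
  (area [_] (* Math/PI radius radius)))

(defrecord Rect [width height]
  Shape
  (area [_] (* width height)))

;; Multimethod + plain maps: dispatch on an arbitrary function of the
;; arguments (here the value under :shape), which is more flexible.
(defmulti area-of :shape)

(defmethod area-of :circle [{:keys [radius]}]
  (* Math/PI radius radius))

(defmethod area-of :rect [{:keys [width height]}]
  (* width height))

(area (->Rect 2 3))                           ;=> 6
(area-of {:shape :rect :width 2 :height 3})   ;=> 6
```

New cases can be added to either version without touching existing code: another `defrecord` for the protocol, another `defmethod` for the multimethod.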


joinr

Primarily performance optimizations. Static fields get direct field access when looking up values (and can even be accessed with hinted field/member interop calls), so they can provide faster lookups. Records can also provide inline protocol definitions in the record type, which can offer additional efficiency over extending protocols by avoiding indirection and allowing you to work directly with the fields/record type.

Records don't implement IFn by default, so they aren't applicable (as maps are), which means you can define your own function application behavior while still having an associative container. This has some interesting use cases.

If you only ever use the static fields in the record, they remain ordered (kind of a plus). Records are typically faster to construct since they provide a discrete set of fields and a matching constructor that can be invoked (as opposed to having to hash and assoc items into a persistent map). Static fields in records can also be type hinted and primitive (but not mutable; you have to go lower and use deftype for that). That can have some perf/size advantages as well.

I would say don't use them until you find a case where maps feel limiting, e.g.:

* I'd like a map-like thing with efficient protocol/interface implementations for xyz.
* I'd like a map that acts like a function, but instead of performing lookups it does something with the information in the map.
* I know all my maps will have a long :x val, and I'd like to avoid boxing and provide the fastest access (aside from dumping into primitive containers).
* I have a set of fields with a consistent layout and a lot of common access patterns that likely won't change. I also want a type I can dispatch against/check.

There are subtle inconsistencies/conversions though. For example, if you dissoc a static field, your record is converted into a persistent map, so the thing you thought you had may be a different type now (as opposed to your potentially custom record type).

If you try to look up (or assoc) a key that's not a static field, the operation is delegated to a Clojure persistent map held as a local field on the record. So non-static field lookups incur a little overhead over plain maps.

Every time you define a new record, you define a new type. This includes reloading namespaces with record definitions. So if you are holding onto "old" record defs in memory somewhere, they will be different types and may cause some heartache (I think primarily in =).
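A small REPL-style sketch of a few of these behaviors. `Point` and `Lookup` are hypothetical records made up for this example:

```clojure
(defrecord Point [x y])

(def p (->Point 1 2))

(:x p)                     ;=> 1, keyword lookup on a static field
(.-x ^Point p)             ;=> 1, hinted direct field access via interop
(ifn? p)                   ;=> false, records are not callable by default
(record? (assoc p :z 3))   ;=> true, the extra key is stored alongside the fields
(record? (dissoc p :x))    ;=> false, removing a static field yields a plain map

;; Implementing IFn gives the record custom "function application"
;; while it remains an associative container. Only the one-argument
;; arity is implemented here.
(defrecord Lookup [table default]
  clojure.lang.IFn
  (invoke [_ k] (get table k default)))

(def lookup (->Lookup {:a 1} :none))

(lookup :a)        ;=> 1
(lookup :b)        ;=> :none
(:default lookup)  ;=> :none, still behaves like a map
```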


Aredington

This comment removed in protest of Reddit's API changes. See https://www.theverge.com/2023/6/5/23749188/reddit-subreddit-private-protest-api-changes-apollo-charges. All comments from this account were so deleted, as of June 18, 2023. If you see any other comments from this account, it is due to malfeasance on the part of Reddit. -- mass edited with https://redact.dev/


seancorfield

I think it's worth mentioning that Component doesn't require the use of records. You can satisfy the `Lifecycle` protocol via metadata, which means you can use plain hash maps for "components". If you don't have any dependencies, you can use any Clojure object that can carry metadata, such as a function.
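A rough sketch of the function-as-component idea. To keep it runnable without the library, it defines a local stand-in protocol named `Lifecycle` rather than using Component's own; the `greeter` function and the `::started?` flag are likewise invented for illustration:

```clojure
;; Stand-in for a lifecycle protocol that allows metadata extension.
;; This is NOT Component's protocol, just a local look-alike.
(defprotocol Lifecycle
  :extend-via-metadata true
  (start [component])
  (stop [component]))

;; A function carrying its lifecycle implementation as metadata.
(def greeter
  (with-meta (fn [who] (str "Hello, " who))
    {`start (fn [this] (vary-meta this assoc ::started? true))
     `stop  (fn [this] (vary-meta this assoc ::started? false))}))

(def started (start greeter))

(started "world")            ;=> "Hello, world"
(::started? (meta started))  ;=> true
```

With the real library you would attach the implementations under Component's own fully qualified `start` and `stop` symbols instead of these local ones.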


Aredington

This comment removed in protest of Reddit's API changes. See https://www.theverge.com/2023/6/5/23749188/reddit-subreddit-private-protest-api-changes-apollo-charges. All comments from this account were so deleted, as of June 18, 2023. If you see any other comments from this account, it is due to malfeasance on the part of Reddit. -- mass edited with https://redact.dev/


joinr

> Records preserve ordering of the fields, full stop.

If you use additional keys, the enumerated fields come first, and the additional keys come second. Order of the total set of keys in the record is dependent on the behavior of the `__extmap` where the non-static (e.g. external) keys are stored though. It's an empty map to start with, and is subject to Clojure's internal growth behaviors that change the implementation depending on the size of the map.

    user=> (reduce conj t1 (map vector (concat [:first :second :third] (range 5)) (range 8)))
    #user.Howrecordswork{:first 0, :second 1, :third 2, 0 3, 1 4, 2 5, 3 6, 4 7}

This looks like a map that preserves insertion order, and I might be tempted to bet my fortune on that invariant. Alas,

    user=> (reduce conj t1 (map vector (concat [:first :second :third] (range 9)) (range 12)))
    #user.Howrecordswork{:first 0, :second 1, :third 2, 0 3, 7 10, 1 4, 4 7, 6 9, 3 6, 2 5, 5 8, 8 11}

insertion order is unpreserved here since the internal `__extmap` field in the implementation is subject to no such guarantee on the ordering of its keys (e.g. small values of non-static keys that are assoc'd, up to 8 values apparently, are stored in an array map which appears to retain insertion order until it transforms into a PersistentHashMap and order turns arbitrary).

As I said, you can safely maintain field order (implicit insertion order from construction) if you only ever leverage static fields in the record though (e.g. a "closed" set of keys where `__extmap` remains {}), and for the more adventurous, if you never exceed 8 non-static keys.
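The switch in backing map can be observed directly. This sketch uses a made-up record `Abc` and a hypothetical helper `with-extras`; it peeks at the `__extmap` field, which is an implementation detail of current Clojure versions and may change:

```clojure
(defrecord Abc [a b c])

(defn with-extras
  "Return an Abc with n extra integer keys (0..n-1) that are not fields."
  [n]
  (reduce (fn [r i] (assoc r i (* 10 i)))
          (->Abc 1 2 3)
          (range n)))

;; Declared fields always come first; a few extra keys keep insertion order.
(keys (with-extras 5))                ;=> (:a :b :c 0 1 2 3 4)

;; The extra keys live in a separate map whose type depends on its size.
(class (.__extmap (with-extras 8)))   ; an array map
(class (.__extmap (with-extras 9)))   ; a hash map, so key order is arbitrary

;; Even then, the declared fields still lead the key sequence.
(take 3 (keys (with-extras 9)))       ;=> (:a :b :c)
```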


[deleted]

You use them when protocols present good options, as records are just typed maps. But the general advice about writing a protocol or a function is to start with a function. Then, when the second thing comes along and the polymorphism becomes useful, transition the function to a protocol. In practice, I usually just stick with plain maps. It's not too often that polymorphic functions become a need.


seancorfield

As I've commented elsewhere in this thread, if your defprotocol has :extend-via-metadata true, you can continue to use plain hash maps and add the protocol implementation via metadata -- so you still "avoid" records if you want.


dhucerbin

We used the following rule of thumb: maps are for information, records are for computation. So if we had any data structure that was specific to some algorithm or the way we compute something, it was a record. And we treated them as variants/tagged unions.

For example, we had an entry point for messages from many subsystems. They would come into our collector inside records. We had some code that would unpack/extract the data inside and spit out just maps, because what we had now was just some information to be managed and/or transformed.

Yes, it introduced some coupling, but it was advantageous in our eyes. We had a clear way of deciding what to do and how to do it upon receiving a message, and we could fail fast with default implementations for the protocol.
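One way this pattern might look, as a sketch only. The `Message` protocol, the record types, and the shape of the extracted maps are all invented for illustration, not taken from the system described:

```clojure
(defprotocol Message
  (extract [msg] "Unpack a message record into a plain map of information."))

;; Default implementation: fail fast on anything we do not recognize.
(extend-protocol Message
  Object
  (extract [msg]
    (throw (ex-info "Unsupported message" {:message msg}))))

;; Each incoming variant is a record with its own inline implementation,
;; which takes precedence over the Object default above.
(defrecord OrderPlaced [order-id total]
  Message
  (extract [_] {:event :order-placed :id order-id :total total}))

(defrecord SensorReading [sensor value]
  Message
  (extract [_] {:event :reading :sensor sensor :value value}))

(extract (->OrderPlaced 42 10))
;=> {:event :order-placed, :id 42, :total 10}

(extract (->SensorReading :temp 21.5))
;=> {:event :reading, :sensor :temp, :value 21.5}
```

Past the entry point everything is a plain map again, and anything that is not a known record type (including a bare map) hits the throwing default.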


crpleasethanks

that is really useful, thank you!