tinyentities

tinyentities Used AI

11 devlogs
11h 57m
•  Ship certified
Created by Kendell

You know HTML entities - like how &gt; is the greater-than sign, and how &copy; is ©? Most JS libraries for encoding/decoding them are very bloated. I'm making a more lightweight one.

Timeline

and now that the core work is done, i worked a little on the documentation: updated the package.json to include some relevant fields and added example usage to the readme. the example for the tryReads uses a TransformStream which i think is neat.

when you make something in a day, the last parts usually end up a bit unfinished. this happened with tinyentities, where stream decoding took 2.5x the time of entities. but note the past tense: i fixed that by not using regex (as much) and now it's neck and neck!

Ship 1

1 payout of 106.0 shells

Kendell

about 1 month ago

Kendell Covers 9 devlogs and 9h 59m

i've added something that would be useful if you need to stream in entities - it basically lets you keep trying to read something that looks like an entity until you figure out whether it is (and you can emit it as its encoding) or it isn't (and you can emit it as text)
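a rough sketch of that "keep trying to read" idea, for illustration only. this is not the tinyentities API: `createStreamDecoder`, `KNOWN`, and `MAX_ENTITY_LENGTH` are hypothetical names, and the entity table is a tiny stand-in. text after a `&` is held back until it's clear whether it forms an entity.

```typescript
// Hypothetical helper for illustration; the real tinyentities streaming API may differ.
const KNOWN = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["copy", "©"],
]);
// Generous upper bound on candidate length, chosen for this sketch.
const MAX_ENTITY_LENGTH = 32;

function createStreamDecoder(emit: (text: string) => void) {
  let pending = ""; // holds "&..." while we are still undecided

  return {
    write(chunk: string): void {
      let out = "";
      for (let i = 0; i < chunk.length; i++) {
        const ch = chunk[i];
        if (ch === "&") {
          out += pending; // the previous candidate was not an entity after all
          pending = "&";
        } else if (pending === "") {
          out += ch; // plain text, nothing held back
        } else if (ch === ";") {
          // Decided: emit the decoded character, or the raw text if unknown.
          out += KNOWN.get(pending.slice(1)) ?? pending + ";";
          pending = "";
        } else if (/[a-zA-Z0-9]/.test(ch) && pending.length < MAX_ENTITY_LENGTH) {
          pending += ch; // still undecided: keep reading
        } else {
          out += pending + ch; // cannot be an entity: emit as text
          pending = "";
        }
      }
      if (out !== "") emit(out);
    },
    end(): void {
      if (pending !== "") emit(pending); // input ended mid-candidate
      pending = "";
    },
  };
}
```

the `write`/`end` pair lines up with the `transform` and `flush` callbacks of a `TransformStream`, so a decoder like this could be wrapped in one. note the sketch only handles named entities with a trailing semicolon.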

i updated the benchmarks to separate init and runtime, and from there i just got to optimizing. i was able to speed up data unpacking, switch how the map used for encoding is stored (for speed), convert xml encoding into one call (also for speed), switch to regexes better suited to my purposes (and, more importantly, faster), and fix multiple bugs multiple times... streaming parser next i guess
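separating init from runtime could look something like the sketch below. `bench` is a hypothetical helper, not the project's actual benchmark code: `init` stands for whatever one-time setup a library does (unpacking its map, for example) and `run` is the per-call work.

```typescript
// Hypothetical harness for illustration; not the benchmark code used by tinyentities.
function bench<T>(init: () => T, run: (state: T) => void, iterations = 1000) {
  const t0 = performance.now();
  const state = init(); // one-time cost, e.g. unpacking a compressed map
  const t1 = performance.now();
  for (let i = 0; i < iterations; i++) run(state); // steady-state cost
  const t2 = performance.now();
  return { initMs: t1 - t0, runMsPerOp: (t2 - t1) / iterations };
}
```

reporting the two numbers separately shows whether a library is slow to start or slow per call, which a single total would hide.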

i set up some benchmarks (well it was a group effort with ai). now we can group functions into:

  • absolute best: escapeHTML, escapeXML, escapeXMLAttribute, decodeHTML, decodeXML
  • best only through bundle size: escapeHTMLAttribute, encodeXML
  • absolutely beaten by entities: encodeHTML

so room for growth

and now decoding. decoding will get more complicated once i add streaming, but i've got a REALLY simple implementation working. (also optimized/restructured map.ts a little to add support for decoding and better tree shaking)
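for a sense of how small a non-streaming decoder can be, here's a minimal sketch. it is an assumption about the general shape, not the code in tinyentities: `decodeXMLSketch` and `NAMED` are made-up names, and the table only holds the five XML entities.

```typescript
// Tiny illustrative table; a real decoder would consult the full entity map.
const NAMED = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"],
]);

function decodeXMLSketch(input: string): string {
  // One regex covers hex references, decimal references, and named entities.
  return input.replace(
    /&(?:#x([0-9a-fA-F]+)|#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));/g,
    (match: string, hex?: string, dec?: string, name?: string) => {
      if (name !== undefined) return NAMED.get(name) ?? match; // unknown names stay as-is
      const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec ?? "", 10);
      // String.fromCodePoint throws above U+10FFFF, so leave those untouched.
      return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
    },
  );
}
```

a `Map` is used rather than a plain object so that names like `constructor` can't accidentally match inherited properties.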

i set up the encoding functions. details:
- i read through code in entities and dom-serializer to figure out the services i need to provide at the end of the day
- i implemented the lighter escapeHTML, escapeHTMLAttribute, escapeXML, and escapeXMLAttribute, which escape just enough to not have problems
- then i implemented the more complex encodeHTML and encodeXML, which encode almost everything, with the former even encoding punctuation and multi character entities when possible.
- i also signalled to bundlers that the mapping can be dropped if unused by wrapping the process of loading it in a pure IIFE (the (() => { code })() things)
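the two ends of that list can be sketched roughly like this. the names (`escapeXMLSketch`, `bigTable`) and the table contents are hypothetical, not the library's own. the `/* @__PURE__ */` comment is the annotation that bundlers such as Rollup, esbuild, and Terser read as "this call has no side effects".

```typescript
// Minimal escape: only the characters that would break XML text content.
// An illustrative subset, not tinyentities' actual implementation.
const XML_TEXT_ESCAPES: Record<string, string> = { "&": "&amp;", "<": "&lt;", ">": "&gt;" };

function escapeXMLSketch(text: string): string {
  return text.replace(/[&<>]/g, (ch) => XML_TEXT_ESCAPES[ch]);
}

// In a library this would be an export. Because the IIFE call is marked pure,
// a bundler may drop the whole table when nothing references `bigTable`.
const bigTable = /* @__PURE__ */ (() => {
  const table = new Map<string, string>();
  for (const [name, ch] of [["copy", "©"], ["reg", "®"]]) {
    table.set(ch, `&${name};`);
  }
  return table;
})();
```

without the annotation, a bundler has to assume the function call might do something observable and keep it, even if the result is unused.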

i offset the bundle size increase a little: any time an entity exists without a semicolon, it also exists with a semicolon, so i can include only the semicolonless version and the semicolon version is implied. 7742 bytes -> 7520 bytes
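a small sketch of how that implication could be undone at load time. the layout here is an assumption made for the example (a trailing `!` on a name marks the semicolonless form, and `expandNames` is a hypothetical helper); the real packed format may differ.

```typescript
// Hypothetical: a trailing "!" marks an entity that is valid without ";".
// The ";" form is implied rather than stored.
function expandNames(packedNames: string[]): string[] {
  const out: string[] = [];
  for (const entry of packedNames) {
    if (entry.endsWith("!")) {
      const name = entry.slice(0, -1);
      out.push(name, name + ";"); // "amp" implies "amp;"
    } else {
      out.push(entry + ";"); // everything else requires the semicolon
    }
  }
  return out;
}

const names = expandNames(["amp!", "lt!", "hearts"]);
```

the saving comes from storing one entry where the source data has two.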

i'm now actually generating the map. the map is a bit bigger now so it can include special cases like &amp (note how there's no ;, so my script packs that as !).

i got the initial version of the mapper going. the idea is to convert the 145.8 kB entities.json into a highly optimized map by making some optimizations (for example, we assume there's always an increment of one between codepoints) and restructuring (like separating each first character level with a newline and each second character level with the > character). it's only 7.1 kB gzipped (half the size of a naive mapper) and should work for both encoding and decoding.
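to illustrate just the "increment of one" assumption, here's a minimal sketch. it is not the actual map format (it ignores the newline and `>` levels entirely): `pack`, `unpack`, and the `name=codepoint` layout are invented for the example. a codepoint is only written out when it isn't the previous one plus one.

```typescript
// Hypothetical packing scheme for illustration; not the real tinyentities map format.
type Entity = [name: string, codepoint: number];

function pack(entities: Entity[]): string {
  const sorted = [...entities].sort((a, b) => a[1] - b[1]);
  let expected = -1;
  const parts: string[] = [];
  for (const [name, cp] of sorted) {
    // Write the codepoint (base 36) only when it breaks the "+1" run.
    parts.push(cp === expected ? name : `${name}=${cp.toString(36)}`);
    expected = cp + 1;
  }
  return parts.join(",");
}

function unpack(packed: string): Map<string, number> {
  const map = new Map<string, number>();
  let expected = -1;
  for (const part of packed.split(",")) {
    const eq = part.indexOf("=");
    const name = eq === -1 ? part : part.slice(0, eq);
    const cp = eq === -1 ? expected : parseInt(part.slice(eq + 1), 36);
    map.set(name, cp);
    expected = cp + 1;
  }
  return map;
}

const packed = pack([["iexcl", 161], ["cent", 162], ["pound", 163], ["copy", 169]]);
// packed is "iexcl=4h,cent,pound,copy=4p"
```

since many entities sit on consecutive codepoints, most entries shrink to just the name.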

i set up the project. i'm not used to making libraries - i usually make web apps - but i'm using tsdown for compilation/transpilation, which i hope will make things easy.
