26:05Very interesting.
26:06I have to look more into that.
26:08So, besides choosing the right CRDTs
to fit into SQLite through virtual
26:14tables I think this, like, just picking
SQLite to run in the browser quite a
26:21few years ago, that's probably you've
must have been running into a bunch of
26:25challenges and issues along the way.
26:27So I'm curious to hear first getting
sort of like a broad overview of
26:31the different challenges you were
running into and then going into them.
26:35Yeah, I think I wasted months of my
life, like fighting SQLite issues.
26:39Cause yeah, it was really early where
the official SQLite WASM project
26:44had kind of like just started.
26:46And I guess some of the first
problems I ran into was.
26:51I guess compiling an
extension into the WASM build.
26:55So yeah, it's one thing to
write a SQLite extension.
26:57It's another thing to get
it to run correctly in WASM.
27:03It wasn't so hard when the extension
was originally written in C, but like,
27:06I don't know, like trying to code review
contributions in C and make sure there's
27:10no like memory issues was a nightmare.
27:13So I eventually like bit the bullet,
started learning Rust and re implemented
27:16the extension in Rust, which I think
was, yeah, it became like a third
27:21or two thirds less code and just way
faster to write after doing that.
27:24I think it was worth the time.
27:26But that introduced this whole new
complication of like getting the
27:29Rust compiler toolchain to compile an
extension that can be linked to the
27:33SQLite compiler toolchain that then goes
through Emscripten to build a Wasm bundle.
27:39And let me see what else was there.
27:42Transactions was an interesting thing.
27:45So the inner, the originally
the interfaces to SQLite
27:49in the browser were async.
27:51And if you ever had Well, I guess
there's two builds of SQLite.
27:55The official SQLite build
was always synchronous.
27:57There's this unofficial build, which I
think is better, called wa-sqlite, which
28:02all the interfaces were asynchronous.
28:03But it had these interesting caveats,
like if you await two calls to
28:07SQLite at once, it would deadlock.
28:09So you had to make sure like there's
only one Call to SQLite ever at once.
28:14So, you know, working
around that challenge.
28:16Transaction interleaving was another,
so like if it's asynchronous and you
28:20start a transaction and you have all
the statements in the transaction, okay.
28:24You, you began the transaction
and then you like do a read,
28:26you're waiting the read.
28:28Well, while you're waiting in the read,
you delegate control of the event loop,
28:31so somebody else could come in and,
you know, kick off a new task, right?
28:36A new statement to do
a write or something.
28:39Well, now that's somebody else's
transaction that's trying to run while
28:41your other transaction is running, so
you get this, like, weird interleaving.
28:44So, yeah trying to lock out transactions
in the JavaScript layer that was
28:48another annoying thing to deal with.
28:49And then, yeah, this is not related
to WASM, but just getting SQLite
28:56with the extension built for all
the platforms I wanted to target.
28:59So, like, eventually Expo wants to
include it in their Expo SDK, so figuring
29:04out how to do Android and iOS builds in
addition to, you know, the WASM build.
29:09And Mac and Linux builds.
29:12Those sound like quite the buffet of
different challenges that some sounds
29:18familiar to me as well as I think some
you've been pioneering a tad earlier.
29:23And by now, some of those
are maybe a bit more.
29:27A well trodden path, but I
think there's probably even more
29:31that we don't understand yet.
29:33One more, I know one more interesting
one I think more interesting than
29:36the other ones is since SQLite
in the browser, at least the one
29:41I'm using, is IndexedDB backed.
29:43IndexedDB has a very high cost
for opening a transaction which
29:48I was unaware of at the time.
29:49So yeah, there's a lot of work.
29:51And anytime you open a SQLite transaction,
it would open an IndexedDB transaction.
29:56So there's a lot of work making sure
that, like, for reads Yeah, normally
30:00you just issue reads, you wouldn't
necessarily batch them into a transaction.
30:03Like normally, you know, you'd batch
your writes into a transaction.
30:06But yeah, to make SQLite in the browser
go fast if you're doing a whole bunch
30:09of reads, like making sure a transaction
was open automatically all the reads
30:14would happen and then the transaction
would close once all the reads are done.
30:16So it's like way of batching that was like
a 10x speed up in the browser for SQLite.
30:21So once you've solved all of those
performance challenges and also like
30:26correctness, syncing challenges,
et cetera, then it's also needs
30:30to be competitive somehow with
the typical developer experience
30:34you have in a React app or in a
different framework you're using.
30:40How did you go about designing that?
30:42Yeah, so, yeah, I've developed a set of
hooks, so, I don't know, it's, I guess
30:47like the React integration was probably
the least amount of, I don't know,
30:51I'd say it's the least amount of work.
30:52Maybe not once I like think about
all the strict code problems and
30:55stuff but yeah, essentially the
DevEx was like a use query hook.
30:59So I think it's a lot easier to
develop when your queries are
31:03co located with your components.
31:04So every component is responsible
for getting the data it needs.
31:08And this makes your app like
super composable, right?
31:10Like you can add and remove components
to your app or your component tree.
31:15And because they're responsible
for getting their data, like, you
31:18don't have to do any additional
wiring or prop drilling.
31:20And yeah, they're not going to break
other components because suddenly,
31:26Some data dependencies are missing.
31:28So I really like that, yeah.
31:29Co located queries, so every component
you could use this hook called UseQuery.
31:34And in that UseQuery, you just
literally write your SQL that fetches
31:39the data required by that component.
31:41And that SQL could be as complex
or as simple as you wanted.
31:44So, you know, like, select from issue
where id equals issue id or something
31:49simple or, you know, something
complex like select star from issue,
31:52order by modify, join labels join
owner, and yeah, all these things.
31:58So you've managed to tame most challenges
with SQLite in the browser, making sure
32:03it works in Wasm, making sure that your
own extension is working, et cetera.
32:08And SQLite can be super, super fast
for most workloads, but once your
32:14cardinality grows, et cetera, your
tables get really big then queries can
32:19also start Taking a little bit longer.
32:22And that is because SQLite always kind
of, besides a little bit of caching,
32:26always starts with your results for
your queries from scratch, kind of.
32:31And you know, I've been chatting about
this for quite a while back then.
32:35This is how I know about
all of this context.
32:37And that has led you to look into.
32:40incrementally maintaining the
query results and the views.
32:43So I'm very curious to hear more
what you've learned back then
32:46and which path this led you down.
32:49Yeah.
32:50Yeah.
32:50So I guess, you know, I was talking
earlier about each component
32:53fetches its own data, has this
use query hook, or just, you know,
32:56specifies a SQL query it's running.
32:58So when you're developing like a rich
client side application like this,
33:02ideally your database is reactive, right?
33:04So any, anytime some data changes.
33:07It's like you, you work
on Overtone, right?
33:09The music app.
33:10If somebody presses play, then, you
know, all the components that care about
33:16play state need to immediately update.
33:18And yeah, as you're saying in SQLite, if
you're doing this somebody changes some
33:22state, well, you know, SQLite doesn't have
any, you know, facilities for reactivity.
33:27It has a few but they're not very good.
33:29Like they don't give you many fine
grained notions about what changed
33:32or like what queries it impacts.
33:34So what you're left doing is rerunning.
33:38Essentially from scratch the queries
that could have been affected.
33:42So, like, the SQLite change notifications,
they'll tell you the row ID that
33:46changed, but not the contents, and
they'll tell you, like, the table.
33:49So it's usually not enough to figure
out exactly which queries to invalidate.
33:52And yeah, so you rerun tons
of queries throughout the app.
33:56And those are all rerunning from
scratch, so like, if you have, like,
34:00for your track list, you're showing a
few hundred tracks, and that track list
34:03has to do a number of joins, right?
34:05Has to, for a track, has to join the
album to get the album title, has to
34:08join the artist to get the artist names.
34:12Yeah, I don't know if
there's any other joins.
34:13Maybe there could be joins about,
like, like status, or favorite status,
34:18or something about the track, right?
34:20And, yeah, rerunning that join to
select, you know, 200 some items.
34:25And yeah, maybe they just want to
sort, and sorting it doing that from
34:28scratch every time somebody mutates
something severely limits I guess how
34:32fast you can interact with the app.
34:34And yeah, I'd worked, tried a
bunch with like, okay, you know,
34:38maybe a purely in memory SQLite,
and that gets you pretty fast.
34:42But yeah, somebody was saying,
oh, like, The goal for a reactive
34:45database is it's memory fast.
34:47And then when I started, like, I
don't know, that idea, a lot just
34:50helped my brain and I started like
actually benchmarking, you know,
34:54if I'm just observing a value
in JavaScript, how fast is that?
34:57And if I'm like, Observing a
query in SQLite and updating the
35:02row and re running the query.
35:03Like, what is that?
35:04And I have a observable notebook
somewhere that like compares all this
35:08and like the difference was massive.
35:09So like just seeing how far I was
from memory fast, like started
35:13making me a bit sad on SQLite.
35:15And also like benchmarking the original
strut and the new one, which use
35:20SQLite and these use query hooks.
35:23Yeah, like the original one I could
throttle my Chrome, you know, but in
35:27the dev tools you can like downgrade
your CPU right to like 6x slower or
35:31something and I could go all the way down
to the slowest possible and everything
35:35was buttery smooth, but in the new
one, like if I downgraded it too much,
35:40like you could see some visible lag.
35:42SQLite
35:45is fundamentally built in this
request response style, right?
35:50This is the era of the LAMP stack
where, you know, databases request
35:55response made sense, right?
35:56A user would go to the website
it would do a query against the
35:59DB and it would render, right?
36:01There was no, like, rich interactivity.
36:03It was always a full
page refresh every time.
36:04So request response made sense.
36:06Yeah, but for these rich apps, request
response no longer makes sense.
36:09And trying to fit a request response
sort of DB designed around that,
36:13it into this reactive scenario.
36:15And I think maybe one day
somebody can get there.
36:19But I think right now it didn't seem
to make much sense to me and the juice
36:23didn't seem to be worth the squeeze.
36:25So I started, I guess my love affair with
SQLite has slowly been coming to a close.
36:30And I started, you know,
investigating other projects.
36:33Yeah maybe one, one day
port it if there's time or.
36:36porting some of these projects
back to SQLite and contributing
36:39incremental data flow, differential
data flow to them or something.
36:43But yeah, I guess I should describe
what these projects are, right?
36:45, So I like the idea of queries, being able
to like, Issue a complex declarative query
36:52against some set of relational data to
get back the data you need for your view.
36:56So I wanted to like, how can
I , incrementally maintain an
37:00arbitrary query against some data?
37:02So like that track list example
you query the, Tracks and their
37:06artists, and their albums, and the
play state, and all these things.
37:10How can we, when somebody does
it right, rather than re running
37:13the query to get the track list
it knows exactly which queries.
37:17Should be invalidated by that write,
and rather than re running them,
37:22knows exactly how to patch up the
rows that are impacted by the write.
37:27So, you know, at first this problem
seems absurdly complex and hard.
37:32And you're like, Oh, yeah, I
want to like invalidate and
37:34patch up any arbitrary SQL query.
37:37But yeah, I started
reading some of the papers.
37:39There's like a paper called DBSP.
37:41It's a differential data flow paper.
37:43I think Materialize is based on that.
37:47And then, yeah, started implementing it.
37:49And then I realized like, I don't know,
this, once you've read it and started
37:52implementing it and you realize like,
this is very similar to other stream
37:56processing stuff , like, and other query,
you know, builders I've done,, right,
38:02where rather than, you know, you create
a series of operations and rather than
38:06them asking a database for the data,
they're like taking streaming data in.
38:12So yeah, like some of the work I'd done
at Meta was on real time abuse systems
38:17and that was all streaming systems.
38:20So I was like, oh, like this
problem isn't that mysterious.
38:22Like I've done stuff like this before.
38:25It seems tractable.
38:26So I, you know.
38:27went full on, decided like, I think we,
I think I can implement this yeah, based
38:32on prior experience, based on having
done query languages before so yeah,
38:36MaterialLite was born which is bringing
differential data flow to JavaScript, so
38:43you can compose, filter, map, reduce, and
a join operator to create these pretty
38:48rich queries and have them reactively
updated anytime there's a write.
38:53Right.
38:53And so for those of the listeners who
have not yet built their own database
38:58and maybe are familiar with like SQL
where in SQL, you have like select from
39:04where, and all of those and JavaScript
or other programming languages who you
39:08might've like, Map, filter, et cetera.
39:11They might have different
names, but they're sort of
39:13like conceptually very similar.
39:15And this is where you're basically just
now like trying to recreate the same
39:20semantics that we have from SQL, where you
can say select star from this where so,
39:25and you can basically have like an array
where it now say .Map .Filter, et cetera,
39:32and where you flip the trade offs from
before you owned in SQLite, you might get
39:38this query just once and then it's done.
39:40You no longer are interested in like
subsequent changes and then like, smaller
39:45updates, but in JavaScript where our
app stays warm, we click the button.
39:49Something's changed slightly.
39:52You want to change the trade
offs quite significantly.