SilverKey Monitor

Real-time SQL caching for Postgres and MySQL

published on 2024/02/21

ReadySet is a transparent database cache for Postgres & MySQL that gives you the performance and scalability of an in-memory key-value store without requiring that you rewrite your app or manually handle cache invalidation. ReadySet sits between your application and database and turns even the most complex SQL reads into lightning-fast lookups. Unlike other caching solutions, it keeps cached query results in sync with your database automatically by utilizing your database’s replication stream. It is wire-compatible with Postgres and MySQL and can be used along with your current ORM or database client.
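The interesting part is the invalidation model: the application never evicts anything, because writes observed on the replication stream do it. A minimal sketch of that idea in plain Python (no real database or ReadySet API here; `ReplicationDrivenCache` and its methods are invented for illustration):

```python
class ReplicationDrivenCache:
    """Caches query results; evicts them when a replication event
    touches a table that a cached query reads from."""

    def __init__(self, execute):
        self.execute = execute   # fallback: run the query on the real DB
        self.cache = {}          # sql -> cached result
        self.tables_for = {}     # sql -> set of tables the query reads

    def query(self, sql, tables):
        # Serve from cache when possible; otherwise hit the database.
        if sql not in self.cache:
            self.cache[sql] = self.execute(sql)
            self.tables_for[sql] = set(tables)
        return self.cache[sql]

    def on_replication_event(self, table):
        # Called for every write the replication stream reports:
        # drop any cached result that depends on the written table.
        stale = [q for q, t in self.tables_for.items() if table in t]
        for sql in stale:
            self.cache.pop(sql, None)
            self.tables_for.pop(sql, None)
```

The point of the sketch is that cache freshness is driven entirely by the write path, so application code issues ordinary reads and never reasons about invalidation.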


Object Storage and In-Process Databases are Changing Distributed Systems

published on 2024/02/20

But why stop there? When storage is an open format, like Parquet, and the data systems, like SQLite or DuckDB, can be embedded in-process in as little as tens of megabytes of memory, and perform queries on data sets that are much larger than memory, it becomes feasible to take these workloads, even exactly the same code, and move them from the cloud directly onto the IoT device or IoT gateway. This can drastically reduce costs (cloud networking, computation, and storage; cellular networking); improve latency and reliability; offer traditionally cloud services locally, independent of the cloud, or with intermittent connectivity to the cloud; improve data privacy by keeping sensitive data locally; and support direct integration with third-party services (service providers for installation and maintenance; platforms for command and control; enterprise historians in industrial environments).[10]

Colin Breck
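"Embedded in-process" is easy to see with SQLite, which ships in Python's standard library (the excerpt also mentions DuckDB and Parquet, which work the same way but need third-party packages). There is no server and no network hop, and the identical code runs in a cloud function or on an IoT gateway; the table and values below are made up for illustration:

```python
import sqlite3

# The database engine runs inside this process; ":memory:" could just
# as well be a file on the device's local flash storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("temp", 21.5), ("temp", 22.1), ("humidity", 40.0)],
)

# Query locally, with no round trip to a cloud service.
avg_temp = conn.execute(
    "SELECT AVG(value) FROM readings WHERE sensor = 'temp'"
).fetchone()[0]
```

Because the engine is just a library call away, the same query logic can run wherever the data happens to live.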

Groq is a Language Processing Unit

published on 2024/02/20

An LPU Inference Engine, with LPU standing for Language Processing Unit™, is a new type of end-to-end processing unit system that provides the fastest inference for computationally intensive applications with a sequential component to them, such as AI language applications (LLMs).

The LPU is designed to overcome the two LLM bottlenecks: compute density and memory bandwidth. An LPU has greater compute capacity than a GPU or CPU with regard to LLMs. This reduces the amount of time per word calculated, allowing sequences of text to be generated much faster. Additionally, eliminating external memory bottlenecks enables the LPU Inference Engine to deliver orders of magnitude better performance on LLMs compared to GPUs.


You can try the tech demo here. It is fast!

PRQL is a modern language for transforming data

published on 2024/02/20
  • PRQL is concise, with abstractions such as variables & functions
  • PRQL is database agnostic, compiling to many dialects of SQL
  • PRQL isn’t limiting — it can contain embedded SQL where necessary
  • PRQL has bindings to most major languages (and more are in progress)
  • PRQL allows for column lineage and type inspection (in progress)
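To give a flavor of the bullets above, here is a rough PRQL sketch (the table and columns are hypothetical; syntax follows the PRQL book, and the compiled SQL varies by target dialect):

```prql
# Reads top to bottom as a pipeline of transforms.
from employees
filter country == "USA"
derive gross_cost = salary + benefits   # new column from existing ones
filter gross_cost > 100000
take 10
# Compiles to roughly:
#   SELECT *, salary + benefits AS gross_cost FROM employees
#   WHERE country = 'USA' AND salary + benefits > 100000 LIMIT 10
```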


In defense of simple architectures

published on 2024/02/20

There are some kinds of applications whose demands make a simple monolith on top of a boring database a non-starter, but for most kinds of applications, even at top-100 site levels of traffic, computers are fast enough that high-traffic apps can be served with simple architectures, which can generally be built more cheaply and easily than complex ones.

Despite the unreasonable effectiveness of simple architectures, most press goes to complex architectures. For example, at a recent generalist tech conference, there were six talks on how to build or deal with side effects of complex, microservice-based, architectures and zero on how one might build out a simple monolith. There were more talks on quantum computing (one) than talks on monoliths (zero). Larger conferences are similar; a recent enterprise-oriented conference in SF had a double-digit number of talks on dealing with the complexity of a sophisticated architecture and zero on how to build a simple monolith. Something that was striking to me the last time I attended that conference is how many attendees who worked at enterprises with low-scale applications that could’ve been built with simple architectures had copied the latest and greatest sophisticated techniques that are popular on the conference circuit and HN.

Our architecture is so simple I’m not even going to bother with an architectural diagram. Instead, I’ll discuss a few boring things we do that help us keep things boring.

We’re currently using boring, synchronous Python, which means that our server processes block while waiting for I/O, like network requests. We previously tried Eventlet, an async framework that would, in theory, let us get more efficiency out of Python, but we ran into so many bugs that we decided the CPU and latency cost of waiting for events wasn’t worth the operational pain of dealing with Eventlet issues. There are other well-known async frameworks for Python, but users of those frameworks at scale often report significant fallout as well. Using synchronous Python is expensive, in the sense that we pay for CPU that does nothing but wait during network requests, but since we’re only handling billions of requests a month (for now), the cost of this is low even when using a slow language, like Python, and paying retail public cloud prices. The cost of our engineering team completely dominates the cost of the systems we operate.
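A toy sketch of the tradeoff, using only Python's standard library (not anything the author's system actually runs): the synchronous answer to concurrency is more workers, not an event loop. Each worker blocks on I/O independently, and the wasted CPU is exactly the time spent in the simulated network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(n):
    # Stand-in for a blocking network call: the worker does nothing
    # but wait, which is the "CPU that does nothing" cost above.
    time.sleep(0.05)
    return n * 2

# Eight blocking handlers run concurrently by adding workers,
# with no async framework involved.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(handle_request, range(8)))
elapsed = time.perf_counter() - start  # ~0.05s, not 8 * 0.05s
```

In production the "workers" are typically whole processes behind a WSGI server rather than threads, but the scaling story is the same: pay for idle waiters, keep the code simple.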