BeyondVocab v1 Launch

BeyondVocab is launching!

After a few months of development and a few weeks of testing and tweaking (really, I told my girlfriend that I had "put a bow on it" at Christmas), I've gotten BeyondVocab v1 (the Chinese-only Anki Add-On version) to a good place to share with the world. I'm now using it for my own Chinese vocabulary practice, studying about 200-300 cards with it every day. It's useful to me, it's rather polished, and I have other ideas in my head. So even though there is a long list of improvements written down in my project directory, I think it's time to publish it and shift my attention.

Some useful links to explore:

Why did I keep working on it after v0?

By late December 2024 I had managed to get a quick port of my "Chinese Learning Shiny App" hosted on my website (here). Let's call that v0. From an accessibility perspective, v0 was a solid improvement over the local version that I had been holding on to since ~2019, but in truth it was not useful. I did not trust it to safely or accurately interact with my Anki collection. v0 is a toy with a similar UX to v1 (and the facts that it's browser-based and that I didn't hobble myself with Anki/PyQt abstractions I half-understood actually make it zippier and more pleasant in some regards). I could pretend to use it in front of people, but it did not satisfy me. I did not put it down like I had told people I would, and I picked it up again immediately after the holidays.

Rather than continuing to beat my head against the raw Anki database files, I took another look (this time with the help of Claude, Aider, and GPT o1) at the Anki Python and Rust codebase, and I decided that in 2025 I could succeed where in 2020 I had failed. (I recall making it a short distance toward building an Anki Add-On in 2020 but being thwarted by my inability to package numpy and pandas dependencies. Given my data scientist background, the need to work with tabular representations of objects was absolute.) With much, much help from AI assistants, I was able to parse out of the source code the unofficial and undocumented API hooks I needed to let my idea flourish from within Anki. The realization that I would be able to make my "Chinese Sentence Suggester" live within Anki gave me hope that v1 would come with two large advantages: (1) it would be safe to use, relying as it would on Anki's API; and (2) other people might actually use it, since it could be "trusted" in Anki's ecosystem.

So, I got to work. I learned a lot. Some of what I learned was fun, eye-opening, or otherwise useful:

  • Having to worry about user friendliness, especially w.r.t. configuration settings
  • Building a landing page that I actually wanted to be attractive! (https://www.beyondvocab.com/)
  • Setting up Google Analytics
  • Setting up Vercel edge functions and a managed database (Supabase)
  • Structured outputs and packaging LLMs into an app

Some of what I did was very distasteful:

  • Jan/Feb: too much optimistic excitement about "agentic" AI coding assistants. I let Aider/Claude run wild on some features for me, leading to many, many hours of tech debt that I would need to unwind manually. With great power comes great responsibility.
  • PyQt. No, I don't think I'll ever use this again. Yes, I needed to use it to give a native Anki feel. Yes, I should have invested a few hours in a Qt tutorial before diving straight in.

So what is BeyondVocab v1?

BeyondVocab v1 is basically an alternative review interface within Anki: it lets you read sentences or paragraphs in Chinese that contain your upcoming words, and it gives you credit for having reviewed them. It has both (a) a flexible interface for generating Chinese text to study (by feeding your words to an LLM) or inputting your own, and (b) an interactive reader interface for review with convenience features like auto-lookup, auto-add, auto-translate, etc. Here's a screenshot of the prep screen. Here's a screenshot of the review screen.

I think of it like a "hyper-personalized graded reader with spaced repetition".
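To make the "credit for words you read" idea concrete, here is a minimal sketch in Python. It is not the add-on's actual code: the function name `words_covered`, the sample passage, and the plain substring matching are all assumptions for illustration. Real Chinese text would likely need proper word segmentation, and nothing here touches Anki itself.

```python
def words_covered(text: str, due_words: list[str]) -> dict[str, int]:
    """Count how often each due word appears in the text.

    Hypothetical helper: uses naive substring matching, which ignores
    word boundaries and is only a stand-in for real segmentation.
    """
    counts: dict[str, int] = {}
    for word in due_words:
        if not word:
            # Skip empty strings; str.count("") would match everywhere.
            continue
        n = text.count(word)
        if n > 0:
            counts[word] = n
    return counts


if __name__ == "__main__":
    # Made-up passage and due list, purely for illustration.
    passage = "我昨天去图书馆借了一本书，图书馆里很安静。"
    due = ["图书馆", "安静", "苹果"]
    print(words_covered(passage, due))
```

Words that show up in the passage would be candidates for review credit; words that do not (here, 苹果) would stay in the queue for a later passage.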

I guess this will be useful to people who:

  • Already use and like Anki
  • Are itching to consume more natural language
  • Love quantitatively tracking their progression

Will I pick it up again?

I have dreams about a BeyondVocab v2, a standalone web app with its own backend and frontend. But I realize that doing it right (building a scalable backend and managing a userbase while ensuring a good experience and continued integration with Anki) would be several more months of work for me. Or several tens of thousands of dollars to hire it out (an option!). This is an exciting idea to me from an enterprising perspective, but it would not further my own educational goals very much. I would strongly consider walking down this road if there is strong feedback from BV1.0.

A more natural and less challenging stepping stone to a v2 would be adding (a) multilingual support and (b) a BV-managed (and paid) account that stands in for the user's need to supply their own API key, while (c) still existing as an Anki Add-On. This is doable for me personally, but, again, I won't go down that road unless I want to learn another language, or unless there is strong user demand.