Talking Diary


6 devlogs
22h 46m
•  Ship certified
Created by AuraRiddle

Ever hit that 2AM spiral like “what even was today?”
But you're too tired or just can't be bothered to type it all out?
Same.

So I am building Talking Diary — a voice-first, emotionally-aware diary that replies to you. You speak or type, it listens, talks back, helps you reflect, and writes your day for you.

Built to explore:

* Natural, human-like conversations
* Letting people vent without overthinking
* Designing a friend more than a log keeper

A little weird, a little personal, and kinda alive.

(Feedback is always welcome)

Timeline

Ship 2

1 payout of 19.0 shells

AuraRiddle

5 months ago

Covers 2 devlogs and 3h 37m

Added text-to-speech support for Talking Diary. I had to tinker with many libraries, including gTTS, pyttsx3, and espeak, before finally settling on Piper. Much of that time wasn't tracked, but it definitely took me more than 2 hours. But yes... here it is: a cute lady talking to you instead of bland text.

And yes, the modular system kept in mind from the start did help a lot while adding new features like this. All of this organisation of files and modules does pay off. I have also added a new module for voice input and output, titled 'voice_io'. It will contain all the functions required for converting speech into text and vice versa. So you have a little idea of what's coming next after this TTS update.
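Roughly, the speaking side boils down to something like this (a minimal sketch, not the exact voice_io code; the model path is just an example and playback via aplay is an assumption from my Linux setup):

```python
# voice_io TTS sketch: pipe text into the Piper CLI, then play the resulting WAV.
# Assumptions: the `piper` binary is on PATH, the model path is hypothetical,
# and `aplay` handles playback (common on Linux).
import subprocess
import tempfile

PIPER_MODEL = "models/en_US-lessac-medium.onnx"  # example model path

def speak(text: str) -> None:
    """Convert text to speech with Piper and play it."""
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as wav:
        wav_path = wav.name
    # Piper reads the text from stdin and writes synthesized audio to --output_file
    subprocess.run(
        ["piper", "--model", PIPER_MODEL, "--output_file", wav_path],
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(["aplay", wav_path], check=True)

if __name__ == "__main__":
    speak("Hello! I'm Kaya, your talking diary.")
```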

Have a great day!


Final blow! This might be my last devlog before shipping the MVP. Here is the final prototype of my product, which works fine on Linux. I will be making a Windows executable as an alternative, as well as hosting it online as a web app, but what was most convenient for me was to make a Linux app, since I use Linux Mint as my primary OS.

The demo link (https://github.com/yashnarang000/talkingdiary2/blob/main/demo.md) has all the instructions required to use the dummy version of Talking Diary with the first default personality - Kaya.

The attached screenshot is proof of how functional the most basic version of Talking Diary is (the first personality is named Kaya).


Ship 1

1 payout of 139.0 shells

AuraRiddle

6 months ago

Covers 4 devlogs and 19h 9m

Hello, I'm back with another devlog, and this one took quite a lot of time: 19 days. There is, in fact, a reason behind it. My laptop's SSD got corrupted. Most of the data was lost, but thank god I had a backup. Talking Diary was also safe, since its repository was uploaded to GitHub. There have been other reasons for being late as well, and I apologise for it.

Now, let us focus on the important stuff: what we have built so far. All the necessary modules for our first prototype are ready, and after connecting them I found problems that needed me to spend some time tracing their roots and fixing them.

So, here is how our first prototype works:
1. You start main.py
2. The chat loop is initiated
3. You talk to Kaya, a friendly Talking Diary
4. You press Esc when you're done expressing yourself
5. Wait until you see 'done'
6. The diary entry is saved in Kaya/output.md

This is the most basic form of our project, and it has the core functions. More will be added later, such as TTS and STT support, by the time the MVP is ready to ship.
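In code, the skeleton of that loop is roughly this (a sketch, not the actual main.py; chat_with_kaya and journalize are stand-ins for the real modules, and typing /done stands in for the Esc key to keep things short):

```python
# Prototype loop sketch: chat with Kaya, then write the day's entry to Kaya/output.md.
from pathlib import Path

def chat_with_kaya(user_text: str, history: list[dict]) -> str:
    """Placeholder for the LLM-backed chat call."""
    history.append({"role": "user", "content": user_text})
    reply = "Tell me more about that."  # the real module would call the LLM here
    history.append({"role": "assistant", "content": reply})
    return reply

def journalize(history: list[dict]) -> str:
    """Placeholder: the real journalizer turns the conversation into a diary entry."""
    return "\n".join(m["content"] for m in history if m["role"] == "user")

def main() -> None:
    history: list[dict] = []
    print("Kaya is listening. Type /done when you're finished.")
    while True:
        user_text = input("> ")
        if user_text.strip() == "/done":
            break
        print("Kaya:", chat_with_kaya(user_text, history))
    out = Path("Kaya") / "output.md"
    out.parent.mkdir(exist_ok=True)
    out.write_text(journalize(history), encoding="utf-8")
    print("done")

if __name__ == "__main__":
    main()
```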


Pheww! These 6 hours of coding flew by like nothing. In the previous devlog, I implemented the chatting function along with the feature to save to and fetch from a .jsonl file that stores the history of the conversation.

Now it was time to go much deeper and more technical, because it was time to program the modules that handle what happens in the background: journalization and data/log management.

What we still required:
1. A journalizer function that converts the text of your conversation with the Diary into literal diary entries that read like the real thing.
2. A set of log-utility functions. What we needed to log:
- Session data (timestamp and number of entries per session)
- History of conversation yet to be converted into diary entries (this is different from the universal history)

Also, I upgraded from .json to .jsonl for better flexibility and efficiency.
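The .jsonl pattern itself is simple: one JSON object per line, so new records can be appended without rewriting the whole file. A rough sketch (the function names here are illustrative, not the actual log_utils.py API):

```python
# JSON Lines helpers: append one record per line, read them all back later.
import json
from pathlib import Path

def append_entry(path: str, entry: dict) -> None:
    """Append one record (e.g. a chat message) as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")

def read_entries(path: str) -> list[dict]:
    """Read every record back, skipping blank lines."""
    p = Path(path)
    if not p.exists():
        return []
    return [json.loads(line)
            for line in p.read_text(encoding="utf-8").splitlines()
            if line.strip()]

# Example:
# append_entry("tempsessionlog.jsonl", {"role": "user", "content": "Today was long..."})
```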

Now, any Talking Diary instance will need these four files:
1. universalhistory.jsonl
2. sessionwisehistory.jsonl
3. tempsessionlog.jsonl
4. session_log.jsonl

The functions are all ready in log_utils.py and journalizer.py.
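The journalizer idea, stripped down, looks something like this (ask_llm is a hypothetical stand-in for the API call, not the real function name):

```python
# Journalizer sketch: hand the session's messages to the "second AI" and ask it
# to write a first-person diary entry.
from datetime import date

JOURNALIZE_PROMPT = (
    "You are a journalizer. Rewrite the user's side of this conversation as a "
    "single first-person diary entry for {day}. Keep it natural and honest.\n\n"
    "{conversation}"
)

def journalize(messages: list[dict], ask_llm) -> str:
    """Turn raw chat messages into one diary entry via the second AI."""
    conversation = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    prompt = JOURNALIZE_PROMPT.format(day=date.today().isoformat(),
                                      conversation=conversation)
    return ask_llm(prompt)
```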

What awaits is the first prototype, which you will be able to see soon!


Okay. Time for a devlog? Fine.

So, as covered in the previous devlog, I created the automation module for the unmute.sh website, which is now an available option in the final software.
Moving further, I did some research and built a module I don't think I ever needed, but built anyway, and finally deleted. It was more of a test of my OOP skills than a module I'd ever use, so... say goodbye to that good-for-nothing module.

Recreated that module, but this time with no OOP (trust me, it wasn't necessary; there are plenty of modules that are famous for their raw functions, and classes are optional).

Back to the research: it was on using LLMs in projects without hosting them ourselves. That's how I got to know about OpenRouter, a treasure trove of LLM APIs that you can use in your projects. The limitation: you need to pay. Sure, there are free LLMs, but they have serious rate limits, and to remove them you need at least $10 of credits in your OpenRouter account.

For the learning part, OpenRouter introduced me to the openai library in Python. It's an amazing module for using any LLM in your project, no matter if it was developed by Google or OpenAI. It's really versatile: just change the base URL and the source platform changes.

I hope it's not too much and I'm explaining it right. Nevertheless, let me get this straight: one module, a hundred platforms, a thousand models to use. So, if I'm unsatisfied with OpenRouter's free rate limits, I can just switch to Gemini by changing the client's base URL and the model name.

Considering all this, I made a module named api_adapter_openai.py, which is just a bunch of functions that help me integrate the openai module into my software. It's also reusable for another project in the future :)
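The core of such an adapter is only a few lines. A minimal sketch (the model name and environment variable are examples, not necessarily what the project uses):

```python
# Adapter sketch: one OpenAI-compatible client, any platform.
# Point base_url at OpenRouter (or another compatible endpoint) and pass the messages.
import os
from openai import OpenAI

def make_client(base_url: str = "https://openrouter.ai/api/v1") -> OpenAI:
    """Build a client for an OpenAI-compatible API (OpenRouter by default)."""
    return OpenAI(base_url=base_url, api_key=os.environ["OPENROUTER_API_KEY"])

def chat(client: OpenAI, messages: list[dict],
         model: str = "meta-llama/llama-3.1-8b-instruct:free") -> str:
    """Send the running conversation and return the assistant's reply."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

# Example:
# client = make_client()
# print(chat(client, [
#     {"role": "system", "content": "You are Kaya, a warm diary companion."},
#     {"role": "user", "content": "Today was exhausting."},
# ]))
```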

What has come true till now:

  • The real chatting thing: you can do it, but text-only
  • It remembers chat history in a .json file of your choice, or starts fresh on every loop initialisation; it all depends on you
  • Instructable: you can instruct it to be whatever you like. Real thing right there!
  • Attachable to LLM APIs of your choice

Attached is a simple demo of what I created.


Made the plan and did some initial code today.
The system is modular. Two AIs will be used in the project.

One AI will chat with the user.
The other will work in the background, converting the conversation into a daily diary entry.

Four major functions of the Talking Diary:
- Speech To Text
- First AI (understanding)
- Text To Speech (natural and emotional tone)
- Second AI (to convert conversations into entries)
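Put together, a single session would flow roughly like this (every function below is a placeholder for the real module, not the actual implementation):

```python
# Pipeline sketch: STT -> first AI -> TTS in a loop, then the second AI journalizes.
def speech_to_text(audio: bytes) -> str:
    return "transcribed user speech"            # STT: your voice -> text

def chat_ai(text: str, history: list[dict]) -> str:
    history.append({"role": "user", "content": text})
    return "an empathetic reply"                # first AI: understands and replies

def text_to_speech(text: str) -> None:
    print("(spoken)", text)                     # TTS: natural, emotional voice

def journalizer_ai(history: list[dict]) -> str:
    return "Dear diary, today I..."             # second AI: conversation -> diary entry

def run_session(audio_chunks: list[bytes]) -> str:
    history: list[dict] = []
    for audio in audio_chunks:
        reply = chat_ai(speech_to_text(audio), history)
        text_to_speech(reply)
    return journalizer_ai(history)              # the finished diary entry for the day
```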

Nominations for first AI:
- Pi AI
- Unmute.sh (just experimenting right now)

Nominations for second AI:
- Gemini
- Any other LLM that can process a large number of tokens efficiently

I've never in my life integrated local LLMs into personal projects, so I might try going that route, or even hosting and paying for them in the cloud.

And I might even work with local open-source TTS and STT models.

For today, I just tried using Playwright to make backend modules out of websites like pi.ai and unmute.sh.
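The Playwright experiment, boiled down, looks something like this (the selectors below are guesses, not the ones unmute.sh or pi.ai actually use):

```python
# Browser-automation sketch: drive a chat website from Python and treat it like a backend.
from playwright.sync_api import sync_playwright

def ask_via_browser(url: str, message: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.fill("textarea", message)            # hypothetical input selector
        page.keyboard.press("Enter")
        page.wait_for_timeout(5000)               # crude wait for the reply to render
        reply = page.inner_text(".last-reply")    # hypothetical reply selector
        browser.close()
        return reply
```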

For making a module for Gemini, its official API can be used.
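A Gemini module could be as small as this, assuming the official google-generativeai package (the model name and environment variable are just examples):

```python
# Gemini sketch via the official Python SDK.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

def ask_gemini(prompt: str, model_name: str = "gemini-1.5-flash") -> str:
    """Send a prompt to Gemini and return the text of its reply."""
    model = genai.GenerativeModel(model_name)
    return model.generate_content(prompt).text
```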
