What's happening with recordings?

Hey all:

A bit over a week ago we released the first half of BobRTC's recording subsystem to everyone, so that you can have your calls recorded, download the audio, and get the scammer's side of the conversation recorded clearly.

This included server-side retention of your recordings, so you don't have to download everything. Here's what we've learned so far:

  • As of right now we’re storing roughly 12GB of data every 24 hours.
  • The archive solution we were experimenting with has proven to suck. Bad. We can only move 2 files a second to archive storage, which is so slow that we cannot really afford to move files there if any user wants to browse and listen to the audio. The files have to be copied back first, which is also a slow process.
  • The archive situation has caused us to pause work on the public website for sharing audio recordings until we sort out what to do about storage. This is what we've learned so far:

  • We needed a new, affordable vendor for cold storage of audio data. We have found one cloud vendor that can satisfy this requirement. Otherwise we’d have to use colo boxes and find a cheap hosting provider that can keep a colo box with a lot of spinning hard drives going AND not kill us with network charges. This is surprisingly difficult to do.

  • We need the ability to write hundreds of files a second, not 2 files a second, since we're writing new audio tracks very frequently thanks to Rapid Redial. Luckily, testing the new vendor has shown some promise.

  • We need to be able to afford this, and to adjust our retention policies around what we can afford to do.

  • On that last point:

    The only way we can really maximize the usefulness of cold storage is for users to manually pick which files are worth saving. The Recordings screen is designed with that in mind, but right now you cannot mark files for saving because we do not want to commit to retaining files until we know we can do it properly. Similarly, creating a public link is not available yet because we do not have a cold storage archive solution.

    Given how much audio we are recording, the 2-day, 7-day and 30-day retention options for unsaved files will need to go away. Instead, it makes far more sense to have a minimum 48-hour retention period for unsaved audio for everyone, and usually longer than that if we switch to an algorithm where we delete the oldest unsaved files to make space for new audio.
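To make the "delete the oldest unsaved files" idea concrete, here is a minimal sketch of how such a cleanup pass might choose what to remove. It is only an illustration under assumptions: the `Recording` fields and the `pick_files_to_delete` function are hypothetical names, not BobRTC's actual code. The only figure taken from the post is the 48-hour minimum.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_RETENTION = timedelta(hours=48)  # minimum retention stated in the post


@dataclass
class Recording:
    # Hypothetical record shape; the field names are illustrative only.
    name: str
    size_bytes: int
    created_at: datetime
    saved: bool = False  # True once the user has marked the file for saving


def pick_files_to_delete(recordings, bytes_needed, now):
    """Return the oldest unsaved recordings to delete, oldest first.

    Saved recordings and anything younger than MIN_RETENTION are never
    chosen, so the result may free less than bytes_needed.
    """
    candidates = sorted(
        (r for r in recordings
         if not r.saved and now - r.created_at >= MIN_RETENTION),
        key=lambda r: r.created_at,
    )
    chosen, freed = [], 0
    for rec in candidates:
        if freed >= bytes_needed:
            break
        chosen.append(rec)
        freed += rec.size_bytes
    return chosen
```

With this shape, unsaved audio lives for at least 48 hours and is only removed when space is actually needed, so in quiet periods it would stick around longer.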

    Similarly, we need to add a switch on the dial screen where you can shut off recording temporarily. If you are XP farming you will build up a lot of undesired audio, which makes searching for your actual scambait calls a pain. The switch also helps reduce the amount of unwanted audio we're storing.

    Once we're fully established with the new cold storage vendor we will finally turn on that checkbox you can't click on in the Recordings screen. When we clean up the hot storage we'll relocate your saved calls over to cold storage, where we can retain them long-term. Right now we cannot guarantee they will be there forever, but we should easily be able to hold on to an audio file for a year without blowing our small budget. The cold storage is fast enough that we can move audio out of our data center to the vendor, and if you want to play a file again, or if you publicly shared it, it can be pulled back from the vendor and streamed to you on demand. We already have a 3GB caching system for audio which stores requested audio in RAM in case an audio file goes viral.
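For the curious, a RAM cache in front of cold storage could look roughly like the sketch below. The post does not say how the real cache decides what to drop, so least-recently-used eviction is an assumption here, and `AudioCache` and the `fetch` callback are hypothetical names. Only the 3GB budget comes from the post.

```python
from collections import OrderedDict


class AudioCache:
    """In-memory LRU cache bounded by total bytes rather than entry count."""

    def __init__(self, max_bytes=3 * 1024**3):  # 3GB budget from the post
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> bytes, least recently used first

    def get(self, key, fetch):
        """Return cached audio, calling fetch(key) on a miss.

        fetch is a hypothetical callable that pulls the file back from
        cold storage and returns its contents as bytes.
        """
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        data = fetch(key)
        self._store(key, data)
        return data

    def _store(self, key, data):
        if len(data) > self.max_bytes:
            return  # too large to cache at all; it is served uncached
        self._items[key] = data
        self.used_bytes += len(data)
        # Evict least recently used entries until we are back under budget.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
```

The point of a design like this is that a file that goes viral gets fetched from the vendor once and is then served from memory, while rarely played files fall out of the cache on their own.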

    Lastly we're looking at options for storing likes/favs against your audio tracks so you can see how many people liked/fav'd your funny calls. This will be for the public website which will have advertising and be social media/search engine friendly.

    And we are aware of existing bugs/kinks with delays in audio getting posted. This is due to files moving around from machine to machine and the sensors not correctly picking up when a file has been fully moved from one processing stage to the next. It's working for 99% of the audio files but it's dropping some audio recordings on the floor. We're still investigating that.
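As an illustration of this class of problem (not a description of how BobRTC's pipeline actually works), one common way to keep a watcher from picking up a half-moved file is to write it under a temporary name and rename it into place only once it is complete. The `.part` suffix and the `deliver` and `ready_files` functions below are hypothetical.

```python
import os
from pathlib import Path

PARTIAL_SUFFIX = ".part"  # hypothetical marker for files still in flight


def deliver(data: bytes, dest: Path) -> None:
    """Write data next to dest under a temporary name, then rename into place."""
    tmp = dest.with_name(dest.name + PARTIAL_SUFFIX)
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp, dest)  # atomic when tmp and dest are on the same filesystem


def ready_files(stage_dir: Path):
    """List files the next stage may safely process, skipping in-flight ones."""
    return sorted(
        p for p in stage_dir.iterdir()
        if p.is_file() and not p.name.endswith(PARTIAL_SUFFIX)
    )
```

With a handoff like this, the next stage never sees a file under its final name until every byte has been written, which is the condition the sensors are currently getting wrong for a small share of recordings.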

    This feature of BobRTC is pretty massive and it's one of the features that does cost us money to maintain. We're hoping to get you a working system that's useful for everyone and hope you can bear with us as we sort all the complexities out.

    Would it be possible to insert a manual recording feature similar to firertc? I’m not sure if this is even possible so please forgive the possible naivety of this question. I was thinking that could possibly help with the storage issue and costs. Again, please forgive me if this question is naive. Cheers.