Show HN: BI-LLM democratizes data analysis
2 by kovezd | 0 comments on Hacker News.
Hi HN! BI-LLM ( https://ift.tt/vH9JZ3O ) automates data analysis workflows on semi-structured data. The assumption is that many decisions could be led by data, but are instead made by intuition because doing the analysis would be too slow or expensive. The sweet spot is decisions based on a few hundred to a couple thousand items: too many to analyze manually, too few to dedicate an analyst to. I believe that powerful insights are derived from a large number of data-driven decisions.

There are 3 things special about BI-LLM:

1. Web-native: the code is written entirely in TypeScript. Native integration with the web avoids the need for specialized software or licenses. More importantly, I hope it contributes to opening the field of AI to a wider group of developers.
2. Runs locally: BI-LLM uses a combination of statistical, machine-learning, and GenAI methods to automate the analysis. Embedding models run in TensorFlow.js, and LLMs in Ollama. Running locally not only helps keep costs down, it also contributes to data privacy and security.
3. No data background required: all the technical details of pre-processing the data are hidden in the implementation. The analysis only requires a simple JSON file, and after a couple of CLI commands it outputs a full report with conclusions written in language that leads directly to decisions.

The data analysis flow starts by turning texts into embeddings and using LLMs to label and score the texts. Then, texts are grouped by similarity, and the scores are used to write an analysis describing each cluster and explaining the correlation between labels and attributes in the final report.

You can see a 2-minute demo ( https://youtu.be/RoLU_REypyY ) of a sample analysis on YouTube. On GitHub, the demo directory also has screenshots from a sample analysis, and the README has diagrams with more detailed explanations of the steps taken during analysis.

The project was partially inspired by feedback from the community.
A few months ago I posted an analysis that stirred some controversy about whether it was written by AI. There were some encouraging comments, including:

> This is the perfect use case for LLMs and Data Analysis in general. I'd legitimately pay money for a similar productivity tool.

The project took shape after speaking with an investor who had a perfect use case for it. The key insight was to move from unsupervised analysis (as in the original article) to supervised analysis: attempting to predict a defined outcome. The final push came when, exploring some leads data, I clustered the data and saw differences in the conversion rate, from 20% down to 2%. The analysis required almost no work, and provided useful insights on how to identify the best prospects.

I figured this could be a great tool for startups and small teams that cannot afford a data analyst. I would love to hear your feedback. Thank you!
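
For anyone who wants a concrete picture of the "group by similarity, then compare outcomes per cluster" idea, here is a minimal sketch. It is not BI-LLM's actual code. The `Item` shape, the function names, the toy 2-dimensional embeddings, and the 0.9 threshold are all assumptions made for illustration, and the greedy threshold grouping is a deliberately simple stand-in for whatever clustering the project really uses. In practice the embeddings would come from an embedding model rather than being hard-coded.

```typescript
// Hypothetical item shape; BI-LLM's real input schema may differ.
interface Item {
  text: string;
  embedding: number[]; // toy values here; normally produced by an embedding model
  converted: boolean;  // the defined outcome we want to explain
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy grouping: an item joins the first cluster whose first member
// is similar enough, otherwise it starts a new cluster.
function clusterBySimilarity(items: Item[], threshold: number): Item[][] {
  const clusters: Item[][] = [];
  for (const item of items) {
    const home = clusters.find(
      (c) => cosineSimilarity(c[0].embedding, item.embedding) >= threshold
    );
    if (home) {
      home.push(item);
    } else {
      clusters.push([item]);
    }
  }
  return clusters;
}

// Share of items in a cluster that reached the outcome.
function conversionRate(cluster: Item[]): number {
  return cluster.filter((i) => i.converted).length / cluster.length;
}

// Made-up leads data, for illustration only.
const items: Item[] = [
  { text: "Asked for enterprise pricing", embedding: [1, 0.1], converted: true },
  { text: "Requested a demo for the team", embedding: [0.9, 0.2], converted: true },
  { text: "Wants a volume discount", embedding: [1, 0], converted: false },
  { text: "Student looking for a free tier", embedding: [0.1, 1], converted: false },
  { text: "Just browsing", embedding: [0, 1], converted: false },
  { text: "Hobby project, no budget", embedding: [0.2, 0.9], converted: false },
];

const clusters = clusterBySimilarity(items, 0.9);
clusters.forEach((cluster, index) => {
  const rate = (conversionRate(cluster) * 100).toFixed(0);
  console.log(`Cluster ${index}: ${cluster.length} items, ${rate}% converted`);
});
```

Once each cluster has a rate next to it, a gap like the 20% vs 2% one in the leads data stands out, and the texts in the high-rate cluster suggest what a good prospect looks like.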