data science

53 items

CASE · ↑ trending · Reddit r/LocalLLaMA · 4/23/2026

Qwen 3.6 27B is a BEAST

A user reports that Qwen 3.6 27B, running locally on a laptop, excels at data science tasks such as tool calls and debugging data transformations. They found it well suited to PySpark/Python work and are considering canceling their cloud subscriptions.

56
RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/23/2026

8 inputs → 58 body params: putting a body-model forward pass inside the training loss [P]

A small multilayer perceptron (MLP) predicts 58 Anny body-shape parameters from 8 questionnaire inputs and is reported to outperform existing photo-based and linear-regression methods. The accuracy is attributed to the training loss, which includes a forward pass through the body model, and the post reports low mean absolute errors on key body measurements.

43
ARTICLE · DEV.to AI · 3d ago

This article compares open-source AI APIs with self-hosted models for small data science practices, weighing cost and practicality from a data scientist's perspective.

30
ARTICLE · DEV.to AI · 4d ago

A data scientist explores cost optimization for large language model usage, comparing API prices for models such as GPT-4o, DeepSeek, and Qwen. The article argues that strategic use of a unified API platform can yield significant savings, and supports this with statistical data and practical examples.

29
DOC · DEV.to AI · 4/25/2026

Pandas DataFrames: Your Data Spreadsheet

This piece explains why Pandas DataFrames are essential for handling real-world, mixed-type data in AI and data science: a DataFrame works like a labeled spreadsheet, whereas a NumPy array is a grid of pure numbers. It introduces DataFrames as tables with labeled rows and columns and includes a Python example.

28
RESEARCH · arXiv CS.LG · 5/8/2026

Data-Driven Variational Basis Learning Beyond Neural Networks: A Non-Neural Framework for Adaptive Basis Discovery

This manuscript introduces Data-Driven Variational Basis Learning (DVBL), a non-neural framework for learning data-adaptive basis functions directly from high-dimensional data. It is presented as an explicit, interpretable, and mathematically transparent alternative to neural networks for representation learning, addressing their limitations in control and transparency.

27
RESEARCH · arXiv CS.CL · 22d ago

Automatic Construction of a Legal Citation Graph from 100 Million Ukrainian Court Decisions: Large-Scale Extraction, Topological Analysis, and Ontology-Driven Clustering

This study details the automatic construction of a legal citation graph from 100 million Ukrainian court decisions. The analysis reveals that judicial citation structure encodes legal domain boundaries and predicts future legislative importance with high accuracy.

27