
PrismML Emerges From Stealth With $16.25 Million, Claims 1-Bit LLM Matches Full-Precision Performance

PrismML, a startup founded by researchers from Caltech, has come out of stealth with $16.25 million in SAFE and seed funding, claiming its 1-bit large language model technology achieves radical compression of AI models without degrading output quality. The company's core assertion is that its quantization approach can reduce model weights to single-bit representations while preserving the performance characteristics of much larger, full-precision models.

If validated at scale, the technology could substantially lower the compute and memory requirements for running frontier AI models, with implications for on-device inference, edge deployment, and data center economics. The funding round was reported by the Wall Street Journal. PrismML had not disclosed the names of investors in the round at the time of publication.

1-bit quantization has been an active area of AI research, with Microsoft publishing work on the technique in 2024 and other labs exploring the tradeoff between compression and accuracy. Most prior approaches showed performance degradation at extreme compression ratios, particularly on complex reasoning tasks. PrismML claims its method avoids those tradeoffs, though independent benchmarks have not yet been published.

The startup enters a crowded field of companies trying to make AI inference more efficient, from Groq, which builds custom silicon, to software-level efforts by Neural Magic and contributors to llama.cpp. PrismML said it plans to use the funding to expand its team and accelerate development of its inference stack.
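PrismML has not published technical details, so the sketch below is not the company's method. It is a generic illustration of what "1-bit weight quantization" has meant in prior literature, such as the sign-plus-scale binarization used in work like Microsoft's BitNet: each weight is replaced by its sign, and a single per-tensor scale preserves the overall magnitude. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def binarize_weights(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize a weight tensor to {-1, +1} with one shared scale.

    Generic sign-plus-scale binarization from prior 1-bit work;
    an illustration only, not PrismML's (unpublished) method.
    """
    alpha = float(np.abs(w).mean())  # per-tensor scale preserving average magnitude
    # Each weight carries 1 bit of information; stored as int8 here for
    # simplicity, but 8 signs could be packed into a single byte.
    w_bin = np.where(w >= 0, 1.0, -1.0).astype(np.int8)
    return w_bin, alpha

def binary_matmul(x: np.ndarray, w_bin: np.ndarray, alpha: float) -> np.ndarray:
    """Approximate x @ w with the binarized weights: x @ (alpha * w_bin)."""
    return alpha * (x @ w_bin.astype(x.dtype))

# Quick check of how much of the full-precision output naive binarization keeps.
rng = np.random.default_rng(0)
w = rng.normal(size=(512, 512)).astype(np.float32)
x = rng.normal(size=(4, 512)).astype(np.float32)

w_bin, alpha = binarize_weights(w)
exact = x @ w
approx = binary_matmul(x, w_bin, alpha)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error of 1-bit approximation: {rel_err:.2f}")
```

Run on random weights, this naive post-hoc binarization produces a substantial approximation error, which is exactly the degradation prior approaches hit; published 1-bit methods such as BitNet instead train with the quantization in the loop, and PrismML has not said which strategy it uses. The scale of the potential savings is clear either way: at 1 bit per weight, a 7-billion-parameter model's weights fit in under 1 GB, versus roughly 14 GB at 16-bit precision.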
Sources
Published by Tech & Business, a media brand covering technology and business. This story was sourced from the Wall Street Journal and reviewed by the T&B editorial agent team.