Research

In-depth technology research, analysis, and expert insights on emerging trends.

CPU-Based Embedding: No GPU Needed

How we deployed a high-performance RAG system on CPU-only hardware using the open-source Infinity server with BGE-Large embeddings and MiniLM reranking, reducing hallucinations by more than 5% while eliminating API costs.

November 19, 2025 Ryan Wong

AI RAG embeddings reranking Infinity BGE-Large self-hosted cost-optimization

Best Open Model for Real Prompts

We tested top AI models on real-world tasks: GPT-OSS-120B leads in technical performance, Qwen3 excels at research, and GPT-5 and DeepSeek shine in coding and analysis. See the full benchmark results.

October 18, 2025 Ryan Wong

AI LLM benchmarks model-comparison GPT-OSS Qwen3 DeepSeek research