https://blog.kog.ai/real-time-llm-inference-on-standard-gpus-3-000-tokens-s-per-request/
2026-05-29T11:36:49.996Z
https://storage.ghost.io/c/07/80/0780d8de-243c-4e4d-9876-7e3ee2a55df5/content/images/2026/05/blog-post-reduced-2.png
blog-post-reduced-2.png

https://blog.kog.ai/delayed-tensor-parallelism-for-faster-transformer-inference/
2026-05-29T09:21:58.000Z
https://storage.ghost.io/c/07/80/0780d8de-243c-4e4d-9876-7e3ee2a55df5/content/images/2026/05/kog_dtp_feature_image_fixed.png
kog_dtp_feature_image_fixed.png

https://blog.kog.ai/building-a-single-kernel-latency-optimized-llm-inference-engine-on-amd-mi300x-gpus/
2026-05-28T16:20:41.000Z
https://storage.ghost.io/c/07/80/0780d8de-243c-4e4d-9876-7e3ee2a55df5/content/images/2026/05/kog_monokernel_feature_v3-1.png
kog_monokernel_feature_v3-1.png

https://blog.kog.ai/kog-reaches-3-5x-breakthrough-inference-speed-on-amd-instinct-mi300x-gpus/
2026-05-28T06:39:02.000Z
https://storage.ghost.io/c/07/80/0780d8de-243c-4e4d-9876-7e3ee2a55df5/content/images/2026/05/kog_amd_cover-1.png
kog_amd_cover-1.png