AI inference cost reduction