Here are pointers to a few selected posts.

Series-A: Mobile LLM on Android

  • Chapter 1: cross-compiling and deploying llama.cpp
  • Chapter 2 (to be updated): latency and power measurement
  • Chapter 3 (to be updated): latency and power modeling
  • Chapter 4 (to be updated): DVFS for LLM: state of the art
  • Chapter 5 (to be updated): DVFS for LLM (continued): future directions