
1-bit LLM Model

 


One of the recently released papers on 1-bit LLMs has been getting a lot of attention. The main idea is to quantize the weights to the range (-1, 0, 1) instead of storing full floating-point numbers the way traditional LLMs do.



The main motivation for restricting all the floating-point weights to (-1, 0, 1) is as follows:

With this restricted range, the floating-point multiplications inside the matrix multiplications are gone: a weight of +1 simply adds the corresponding activation, -1 subtracts it, and 0 skips it entirely. This makes the model much more efficient in terms of hardware and GPU resources, as sketched below.
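
As a rough illustration (a toy sketch of the idea, not code from the paper), here is how a matrix-vector product with weights restricted to -1, 0 and 1 can be computed using only additions and subtractions:

```python
import numpy as np

def ternary_matvec(W_ternary, x):
    """Matrix-vector product where every weight is -1, 0 or +1.

    No weight is ever multiplied: a +1 adds the activation, a -1
    subtracts it, and a 0 skips it entirely.
    """
    out = np.zeros(W_ternary.shape[0], dtype=x.dtype)
    for i, row in enumerate(W_ternary):
        acc = 0.0
        for w, xj in zip(row, x):
            if w == 1:
                acc += xj      # +1 weight: add the activation
            elif w == -1:
                acc -= xj      # -1 weight: subtract the activation
            # w == 0: contributes nothing, skip
        out[i] = acc
    return out

# Toy example: a 2x3 ternary weight matrix times a length-3 activation vector.
W = np.array([[ 1, 0, -1],
              [-1, 1,  0]])
x = np.array([0.5, -2.0, 3.0])
print(ternary_matvec(W, x))   # [-2.5 -2.5]
```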

This is also called 1.58-bit rather than 1-bit, because a single bit can only hold two possibilities (0 and 1), and here we are also considering -1, which gives three possible values per weight.
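
Three values cannot fit into a single bit; the information content per weight works out to:

```latex
\log_2 3 \approx 1.585 \ \text{bits per weight}
```

which is where the 1.58 figure comes from.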

The formula used to convert the floating-point weights to (-1, 0, 1) is as follows.
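
I believe the paper in question is BitNet b1.58, which describes an "absmean" quantization function. As far as I can reconstruct it, for a weight matrix W with n x m entries:

```latex
\widetilde{W} = \mathrm{RoundClip}\!\left(\frac{W}{\gamma + \epsilon},\, -1,\, 1\right),
\qquad
\mathrm{RoundClip}(x, a, b) = \max\bigl(a,\, \min(b,\, \mathrm{round}(x))\bigr),
\qquad
\gamma = \frac{1}{nm}\sum_{i,j} |W_{ij}|
```

In words: divide every weight by the mean absolute value of the matrix, round to the nearest integer, and clip the result into [-1, 1]. A minimal NumPy sketch of this step (my own illustrative code, not the paper's implementation) could look like:

```python
import numpy as np

def absmean_quantize(W, eps=1e-8):
    """Quantize a floating-point weight matrix to the ternary values -1, 0, 1.

    Divides W by its mean absolute value (the "absmean"), rounds each entry
    to the nearest integer, and clips the result into [-1, 1].
    """
    gamma = np.mean(np.abs(W))                 # mean absolute value of all weights
    W_scaled = W / (gamma + eps)               # scale so typical weights sit near +/-1
    return np.clip(np.round(W_scaled), -1, 1)  # round, then clip into {-1, 0, 1}

W = np.random.randn(4, 4).astype(np.float32)
print(absmean_quantize(W))   # a 4x4 matrix containing only -1.0, 0.0 and 1.0
```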



Performance in terms of memory, latency and hardware:

From what I remember of the paper, at comparable model sizes the 1.58-bit model needs substantially less GPU memory and has lower decoding latency than an FP16 baseline, and the gap widens as the model scales up.




Though the performance in saving resources is great, there is still not much evidence about how accurate its answers are, so output quality remains an open question for now.






