candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-21 20:22:49 +00:00

Author	SHA1	Message	Date
Jorge António	75b6d4b0da	add config for mamba 2.8b model parameter (#1946) * first commit * Make the mamba config public. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-03-27 07:47:23 +01:00
Laurent Mazare	455c42aa72	Avoid copying the data on squeeze and unsqueeze. (#1884) * Avoid copying the data on squeeze and unsqueeze. * Fix the quantized llama example. * Unrelated fix for the quantized stable-lm example on cuda. * Fix for mamba on cuda (unrelated to the PR).	2024-03-20 13:04:36 +01:00
Laurent Mazare	1a6043af51	Tweak the VarMap set type. (#1758)	2024-02-25 20:50:08 +01:00
Laurent Mazare	1e26d539d9	Improved mamba model optimized for inference (#1694) * Sketch the mamba model for inference. * Complete the forward pass. * Add the mamba example. * Optimize the selective-scan part. * Fix a couple shape mismatches and get inference to work. * Tweak the readmes. * More readme tweaks.	2024-02-11 17:04:57 +01:00