candle/candle-examples/examples/reinforcement-learning/README.md
Kyle Birnbaum 648596c073 Added readmes to examples (#2835)
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2025-04-03 09:18:29 +02:00

candle-reinforcement-learning

Reinforcement Learning examples for candle.

Warning

As of 2025/3/28, uv is not compatible with pyo3.

System-wide Python

These examples have been tested with gymnasium version 0.29.1. You can install the Python package with:

pip install "gymnasium[accept-rom-license]"
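Since the examples were tested against one specific gymnasium version, it can help to confirm what the system-wide Python actually has installed before building. The following is a minimal sketch, not part of the candle repository: the helper name `gymnasium_status` and the constant `TESTED_VERSION` are hypothetical, and it relies only on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version

# Version this README reports as tested (hypothetical constant name).
TESTED_VERSION = "0.29.1"


def gymnasium_status(package: str = "gymnasium") -> str:
    """Describe whether `package` is installed and matches TESTED_VERSION."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # The package is not visible to this Python interpreter.
        return f"{package} is not installed"
    if installed == TESTED_VERSION:
        return f"{package} {installed} matches the tested version"
    return f"{package} {installed} differs from the tested version {TESTED_VERSION}"


if __name__ == "__main__":
    print(gymnasium_status())
```

Run it with the same interpreter that pyo3 will link against. A different version is not necessarily a problem, but it is the first thing to check if the examples fail to start.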

To run the examples, use the following commands. Note the additional --package flag, which ensures that there is no conflict with the candle-pyo3 crate.

For the Policy Gradient example:

cargo run --example reinforcement-learning --features=pyo3 --package candle-examples -- pg

For the Deep Deterministic Policy Gradient example:

cargo run --example reinforcement-learning --features=pyo3 --package candle-examples -- ddpg