Shawn: Introduction to Rust (for Substrate / Polkadot) #134

Open
shawntabrizi opened this issue Sep 14, 2023 · 14 comments
Labels
Content Addition Suggestions for new content additions.

Comments

@shawntabrizi

shawntabrizi commented Sep 14, 2023

In Progress


This issue will capture my thoughts on the different topics that an introduction to Rust for Substrate and Polkadot should have.

I have explicitly not chosen to look at the existing content, and work from my own experience to compile a list of content that I think should be covered.

I will use the Rust Book as a basis for the content that people learn outside of Substrate and Polkadot.


I would start by having users go through chapters 1 - 4 of the Rust Book on their own. I don't think there is much value to reproduce this content on our side, as we don't have much to expand on, and it would end up being copy paste.

I do think we need to have our own "installation" section, and basic commands needed to keep rust working in our ecosystem:

  • rustup update (always)
  • install wasm target
  • cargo fmt
  • etc...

I think we should also give a nice and clear warning that rust-analyzer does work, but not super well. We should probably include a special section on specifically configuring rust-analyzer for Substrate projects.


There needs to be a special section which specifically talks about the wasm32-unknown-unknown target, and no_std. This should be framed as entirely Rust things, which are then almost always used in Substrate.

Many of the examples used when teaching Rust rely on things like println!, rand, and many other no_std-incompatible APIs. Students should be prepared in advance to understand that these tools are not available to them, and why.

This should cover the intricacies of adding new crates to a no_std project. Specifically the TOML file, and errors that can result from that.


We need special emphasis on panics, what kinds of code triggers a panic (unwrap, overflows, etc...), and how we can always write Rust in panic-safe ways.
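
A minimal sketch of what panic-safe code could look like, using only the standard library. The names `MathError`, `average`, and `first_byte` are hypothetical and exist only for this illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum MathError {
    Overflow,
    DivisionByZero,
}

/// Averages `values`, returning an error where naive code would panic.
pub fn average(values: &[u32]) -> Result<u32, MathError> {
    let mut sum: u32 = 0;
    for v in values {
        // `checked_add` returns `None` on overflow instead of panicking or wrapping.
        sum = sum.checked_add(*v).ok_or(MathError::Overflow)?;
    }
    // A fallible conversion replaces an `as` cast or an `unwrap`.
    let count = u32::try_from(values.len()).map_err(|_| MathError::Overflow)?;
    // `checked_div` returns `None` for a zero divisor; plain `/` would panic.
    sum.checked_div(count).ok_or(MathError::DivisionByZero)
}

/// Reads the first byte without indexing; `bytes[0]` panics on an empty slice.
pub fn first_byte(bytes: &[u8]) -> Option<u8> {
    bytes.first().copied()
}
```

Here `average(&[2, 4, 6])` returns `Ok(4)`, while `average(&[])` returns `Err(MathError::DivisionByZero)` rather than aborting execution.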


Need special emphasis on floating point numbers, and how they are not useful in blockchain environments.

We should teach how we can use unsigned integers and create fixed point numbers.

Special emphasis on safe math operations, and how Rust can "hide" overflows in production builds.
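
One way to sketch a fixed-point number on top of an unsigned integer, with checked arithmetic throughout. The `Fixed` type and its six-decimal scale are assumptions made for this example, not a type from Substrate.

```rust
use std::convert::TryFrom;

/// Hypothetical fixed-point value: a `u64` counting millionths.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fixed(u64);

impl Fixed {
    /// Number of inner units that represent 1.0.
    pub const SCALE: u64 = 1_000_000;

    /// Builds a value from a whole part plus a count of millionths.
    pub fn from_parts(whole: u64, millionths: u64) -> Option<Fixed> {
        whole
            .checked_mul(Self::SCALE)?
            .checked_add(millionths)
            .map(Fixed)
    }

    /// Addition that reports overflow instead of wrapping.
    pub fn checked_add(self, other: Fixed) -> Option<Fixed> {
        self.0.checked_add(other.0).map(Fixed)
    }

    /// Multiplication: widen to `u128` so the intermediate product cannot
    /// overflow, rescale, then check that the result fits back into `u64`.
    pub fn checked_mul(self, other: Fixed) -> Option<Fixed> {
        let wide = (self.0 as u128) * (other.0 as u128) / (Self::SCALE as u128);
        u64::try_from(wide).ok().map(Fixed)
    }
}
```

With this sketch, 1.5 times 2.0 is `Fixed::from_parts(1, 500_000)?.checked_mul(Fixed::from_parts(2, 0)?)`, which yields the same value as `Fixed::from_parts(3, 0)`. The result is identical in debug and release builds because no operation relies on the build profile's overflow checks.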


We should have a special section about Strings in Substrate, and how they are represented as UTF-8 encoded bytes in a Vec<u8>.

Rust book has an entire section on Strings, and we should emphasize you will likely never touch a real string type in Substrate. Or in blockchains in general.
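
A small sketch of treating text as raw bytes. The helper names `to_bytes` and `to_text` are hypothetical.

```rust
/// Converts text into the raw UTF-8 bytes that would be stored.
pub fn to_bytes(text: &str) -> Vec<u8> {
    text.as_bytes().to_vec()
}

/// Tries to read stored bytes back as text. Arbitrary bytes are not always
/// valid UTF-8, so this returns `None` rather than panicking.
pub fn to_text(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

For example, `to_bytes("DOT")` gives `[68, 79, 84]`, and `to_text(&[0xff, 0xfe])` gives `None` because those bytes are not valid UTF-8.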

This might bleed into SCALE, which I feel we might need to bring into these lessons.

https://substrate.stackexchange.com/questions/1174/why-is-it-a-bad-idea-to-use-string-in-an-ink-smart-contract


Also a SCALE-ish topic: we should emphasize which data types and structures encode to the same bytes and which do not.

For example, a tuple and a struct with the same field types in the same order can encode identically. Two structs whose fields are in a different order do not.
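
To illustrate the idea without pulling in a codec crate, here is a toy encoder that concatenates fields in declaration order and writes integers as little-endian bytes. `ToyEncode`, `Account`, and `Swapped` are hypothetical names; this is a sketch of the concept, not the real SCALE codec API.

```rust
/// Hypothetical stand-in for an encoding trait; NOT the real SCALE API.
pub trait ToyEncode {
    fn encode_to(&self, out: &mut Vec<u8>);

    fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        self.encode_to(&mut out);
        out
    }
}

impl ToyEncode for u8 {
    fn encode_to(&self, out: &mut Vec<u8>) {
        out.push(*self);
    }
}

impl ToyEncode for u32 {
    fn encode_to(&self, out: &mut Vec<u8>) {
        // Fixed-width integers are written as little-endian bytes.
        out.extend_from_slice(&self.to_le_bytes());
    }
}

impl<A: ToyEncode, B: ToyEncode> ToyEncode for (A, B) {
    fn encode_to(&self, out: &mut Vec<u8>) {
        // Tuple elements are concatenated in order, with no separators.
        self.0.encode_to(out);
        self.1.encode_to(out);
    }
}

pub struct Account {
    pub id: u32,
    pub level: u8,
}

pub struct Swapped {
    pub level: u8,
    pub id: u32,
}

impl ToyEncode for Account {
    fn encode_to(&self, out: &mut Vec<u8>) {
        // Fields are written in declaration order: `id`, then `level`.
        self.id.encode_to(out);
        self.level.encode_to(out);
    }
}

impl ToyEncode for Swapped {
    fn encode_to(&self, out: &mut Vec<u8>) {
        // Same data, different field order: `level`, then `id`.
        self.level.encode_to(out);
        self.id.encode_to(out);
    }
}
```

Under this toy scheme `Account { id: 1, level: 2 }` and the tuple `(1u32, 2u8)` both encode to `[1, 0, 0, 0, 2]`, while `Swapped { level: 2, id: 1 }` encodes to `[2, 1, 0, 0, 0]`.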


Special emphasis on designing systems with a builder pattern.

Private vs Public fields in structs, etc...

We can probably find some tangible examples of where to use the builder pattern.
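
A possible starting example, sketched with private fields so that the only way to obtain a `NodeConfig` is through its builder. All names and the default of 25 peers are hypothetical.

```rust
/// Hypothetical config type. Its fields are private, so code outside this
/// module cannot construct it directly and must go through the builder.
#[derive(Debug, PartialEq)]
pub struct NodeConfig {
    name: String,
    max_peers: u32,
}

impl NodeConfig {
    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn max_peers(&self) -> u32 {
        self.max_peers
    }
}

/// Collects optional settings, then validates them once in `build`.
#[derive(Default)]
pub struct NodeConfigBuilder {
    name: Option<String>,
    max_peers: Option<u32>,
}

impl NodeConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Each setter takes and returns `self` so calls can be chained.
    pub fn name(mut self, name: &str) -> Self {
        self.name = Some(name.to_string());
        self
    }

    pub fn max_peers(mut self, max_peers: u32) -> Self {
        self.max_peers = Some(max_peers);
        self
    }

    /// Fails with a message instead of panicking when a required field is missing.
    pub fn build(self) -> Result<NodeConfig, &'static str> {
        let name = self.name.ok_or("name is required")?;
        Ok(NodeConfig {
            name,
            max_peers: self.max_peers.unwrap_or(25),
        })
    }
}
```

Usage would look like `NodeConfigBuilder::new().name("alice").max_peers(8).build()`. Leaving out `name` yields `Err("name is required")`.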


Special emphasis on different ways to structure projects.

  • start with a large project in a single file
  • create multiple impls on a trait
  • create multiple internal modules
  • move those to sub-files and sub-folders
  • handling importing of mods

Special emphasis on feature flags and feature gating. Specifically creating a new feature flag, plus the no_std and tests feature flags.

explaining: #![cfg_attr(not(feature = "std"), no_std)]

https://substrate.stackexchange.com/questions/1274/why-do-we-need-cfg-attrnotfeature-std-no-std
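
A rough sketch of feature gating inside a crate. It assumes the crate's Cargo.toml declares a feature named `std`; the `log` and `double` functions are hypothetical. The crate-level attribute is shown as a comment because it must be the first item in lib.rs.

```rust
// At the very top of lib.rs one would write:
// #![cfg_attr(not(feature = "std"), no_std)]
// meaning: "when the `std` feature is NOT enabled, apply `#![no_std]`".

/// Compiled only when the `std` feature is enabled, because `println!`
/// needs the standard library.
#[cfg(feature = "std")]
pub fn log(msg: &str) {
    println!("{}", msg);
}

/// Fallback with the same signature for `no_std` builds: it does nothing.
#[cfg(not(feature = "std"))]
pub fn log(_msg: &str) {}

/// Code shared by both builds can call `log` without caring which
/// version was compiled in.
pub fn double(x: u32) -> Option<u32> {
    log("doubling");
    x.checked_mul(2)
}
```

Exactly one of the two `log` definitions exists in any given build, so callers such as `double` compile unchanged for both native and Wasm targets.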


Special emphasis on creating macros

  • #![recursion_limit = "1024"]
  • different kinds of macros, how they work, what they do, how they are useful, etc...

https://danielkeep.github.io/tlborm/book/README.html
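
As a small illustration of a declarative macro, and of why a recursion limit exists at all, here is a hypothetical `bytes_of!` macro that calls itself once per argument. Each recursive expansion counts against the compiler's recursion limit.

```rust
/// Hypothetical macro: concatenates the UTF-8 bytes of each string
/// expression it is given into one `Vec<u8>`.
macro_rules! bytes_of {
    // Base case: no input expands to an empty vector.
    () => {
        Vec::<u8>::new()
    };
    // Recursive case: handle the first expression, then expand again
    // on whatever remains.
    ($head:expr $(, $tail:expr)*) => {{
        let mut out: Vec<u8> = $head.as_bytes().to_vec();
        out.extend(bytes_of!($($tail),*));
        out
    }};
}
```

For example, `bytes_of!("ab", "c")` expands to code that produces `[97, 98, 99]`.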


Bounded vectors and other "bounded types". This certainly bleeds into SCALE, which probably should be here.

If we touch on scale, we should talk about what it does and how it works and stuff.
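
To make the idea of a "bounded type" concrete, here is a sketch of a vector whose maximum length is part of its type. This `Bounded` type uses a const generic for the limit and is purely illustrative; the bounded types in the real libraries may express the bound differently.

```rust
/// Hypothetical bounded vector: it can never hold more than `MAX` items.
#[derive(Debug, PartialEq)]
pub struct Bounded<T, const MAX: usize> {
    items: Vec<T>,
}

impl<T, const MAX: usize> Bounded<T, MAX> {
    pub fn new() -> Self {
        Bounded { items: Vec::new() }
    }

    /// Pushes an item, handing it back as `Err` if the bound is already reached.
    pub fn try_push(&mut self, item: T) -> Result<(), T> {
        if self.items.len() >= MAX {
            return Err(item);
        }
        self.items.push(item);
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }

    /// Worst-case in-memory size of the items, known at compile time.
    /// This is the kind of upper bound a "maximum encoded length" builds on.
    pub const fn max_items_bytes() -> usize {
        MAX * std::mem::size_of::<T>()
    }
}
```

With `Bounded<u8, 2>`, the third `try_push` returns `Err` with the rejected item, and `Bounded::<u32, 4>::max_items_bytes()` is 16.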


Generics and associated types.

When to use each.

https://github.com/shawntabrizi/substrate-trait-tutorial
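
A compact sketch of the difference, with hypothetical trait and type names. A generic parameter lets one type implement the trait several times; an associated type fixes a single choice per implementor.

```rust
/// Generic parameter: a type may implement this once per choice of `T`.
pub trait ConvertTo<T> {
    fn convert(&self) -> T;
}

pub struct Meters(pub u32);

impl ConvertTo<u32> for Meters {
    fn convert(&self) -> u32 {
        self.0
    }
}

impl ConvertTo<String> for Meters {
    fn convert(&self) -> String {
        format!("{} m", self.0)
    }
}

/// Associated type: each implementor picks exactly one `Balance`,
/// so callers never have to spell it out.
pub trait Ledger {
    type Balance;
    fn total(&self) -> Self::Balance;
}

pub struct SmallLedger(pub Vec<u8>);

impl Ledger for SmallLedger {
    type Balance = u32;

    fn total(&self) -> u32 {
        // Widen each byte before summing so the sum itself cannot overflow `u8`.
        self.0.iter().map(|b| u32::from(*b)).sum()
    }
}
```

With the generic trait the caller must say which conversion is wanted, for example `let n: u32 = Meters(5).convert();`. With the associated type, `SmallLedger(vec![1, 2, 3]).total()` needs no annotation.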


fancy use of cargo

  • build a specific package
  • run a specific test
  • debug vs release vs production builds

Talk briefly about Box and how we use it in Wasm and other memory limited environments.
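
One concrete angle on `Box` in memory-limited environments is keeping large payloads off the stack. The sketch below uses hypothetical enums to compare the size of a value with and without boxing.

```rust
use std::mem::size_of;

/// Every value of this enum is as large as its biggest variant,
/// even when it only holds `Small`.
pub enum Inline {
    Small(u8),
    Large([u8; 1024]),
}

/// Boxing the large payload moves it to the heap, so the enum itself
/// only needs room for a pointer plus a tag.
pub enum Boxed {
    Small(u8),
    Large(Box<[u8; 1024]>),
}

pub fn inline_size() -> usize {
    size_of::<Inline>()
}

pub fn boxed_size() -> usize {
    size_of::<Boxed>()
}
```

Comparing `inline_size()` with `boxed_size()` shows the boxed enum is far smaller, which matters when such values are passed around or nested inside other types.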


Crates interacting with each other through traits

  • using a third, "library" crate to connect two crates
  • showing problems with recursive dependencies

Conflicting associated type names, and how to resolve the rust error.

Inferred types in general, and how they work.

Processes for debugging this.
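
A sketch of the conflicting associated type situation and one way to resolve it. The traits `Currency` and `Staking` and the `Runtime` type here are hypothetical stand-ins that happen to share an associated type name.

```rust
pub trait Currency {
    type Balance;
    fn free(&self) -> Self::Balance;
}

pub trait Staking {
    type Balance;
    fn staked(&self) -> Self::Balance;
}

pub struct Runtime;

impl Currency for Runtime {
    type Balance = u64;
    fn free(&self) -> u64 {
        100
    }
}

impl Staking for Runtime {
    type Balance = u32;
    fn staked(&self) -> u32 {
        40
    }
}

/// Writing `T::Balance` in the bounds below would be rejected as ambiguous,
/// because both traits define `Balance`. The fully qualified form
/// `<T as Trait>::Balance` tells the compiler which one is meant.
pub fn total<T>(t: &T) -> u64
where
    T: Currency + Staking,
    <T as Currency>::Balance: Into<u64>,
    <T as Staking>::Balance: Into<u64>,
{
    // Explicit annotations keep type inference simple to follow.
    let free: u64 = t.free().into();
    let staked: u64 = t.staked().into();
    free.saturating_add(staked)
}
```

Calling `total(&Runtime)` returns 140. When debugging inference problems like this, adding explicit type annotations one binding at a time is a simple way to find where the compiler's view differs from yours.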


passing iterators vs vectors / slices to functions

working with iterators in general
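
A short sketch contrasting the two signatures; the function names are hypothetical.

```rust
/// Takes a slice: the caller must already have the data stored
/// contiguously in memory.
pub fn sum_slice(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, v| acc.saturating_add(*v))
}

/// Takes anything iterable: vectors, ranges, or lazy adaptor chains,
/// without requiring the caller to collect into a `Vec` first.
pub fn sum_iter<I>(values: I) -> u32
where
    I: IntoIterator<Item = u32>,
{
    values
        .into_iter()
        .fold(0u32, |acc, v| acc.saturating_add(v))
}
```

`sum_iter` accepts `vec![1, 2, 3]` directly, and also a lazy chain such as `(1..=4).filter(|n| n % 2 == 0)`, which never allocates an intermediate vector.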


rust modules vs rust libraries

in substrate, you mostly work with libraries

@shawntabrizi shawntabrizi added the Content Addition Suggestions for new content additions. label Sep 14, 2023
@shawntabrizi
Author

shawntabrizi commented Sep 14, 2023

I will use this issue to make comments about actually going through the course.

We do not assume any existing knowledge of Rust and will start from the beginning. However, we assume some general familiarity with programming and that you are fluent in at least one other programming language.

I can understand this approach, but I think it is not a great one. I generally don't like that basic information is repeated from the Rust book: it can discourage someone who already knows some basic Rust from going through the course, and it creates a long-term maintenance burden.

Better to reference specific chapters from the rust book, and just expand on why these things are relevant for Substrate, or make specific examples which contextually help with substrate.


end of the day, up to you, but less content to maintain is usually a good thing.

pointing users to other good resources is also a good thing

@shawntabrizi
Author

Rust Safety by Example

The example presented is an example of type safety, but does not really match the description of safety provided above.

I think type safety should be more emphasized, and the example should be titled "Rust Type Safety by Example".

@shawntabrizi
Author

Basic Wasm Architecture

This page could be expanded quite a bit. Especially describing more why Wasm is a great format for blockchains.

For example, sandboxed, deterministic, fast, etc...

benchmarks can be linked here showing speed of wasm compared to other environments.

Perhaps you should mention other environments students might be interested to look into besides wasm.

@shawntabrizi
Author

shawntabrizi commented Sep 19, 2023

Installing & Setting up a Rust Developer Environment

Would suggest you link away for installing rust, as it may change for different operating systems (like windows)

suggest that you also include installing the wasm toolkit stuff just so that is already done and out of the way, and associated with the base rust installation.

Rust analyzer does not work very well with Substrate. Needs to be specially configured or suggested that we disable it imo... or at least a warning about the performance and the pros vs cons

I would suggest we link to or describe creating and running a basic hello world application locally, which closes the story on doing things locally.

@CrackTheCode016
Contributor

CrackTheCode016 commented Sep 21, 2023

I agree with this line of thinking. We should look for ways to uniquely accompany the Rust book (and maybe even material like RustWasm) in a way that captures things that are useful for runtime development for Substrate.

I think I can see a good direction (I know you are still concluding your analysis). This is my current thinking for how it may go (up for interpretation, of course :) ). To be clear, it might be too much to cover, but we can prioritize as needed:

Change Title: Intro to Rust for Substrate


We remove the first 3-4 modules of the Rust course, and condense the basics into a single module. Users will have assumed knowledge of syntax, which can be inferred on the way anyways. Instead, the course will mostly focus on the following:

  • Developing in a Wasm/no-std environment, taking cues from web development (Yew uses a similar model to Substrate for example, host functions from external environments etc.),
  • Design patterns that deeply utilize generic typing such as the builder pattern, but also excessive use of trait objects and dynamic dispatching and how they work.
  • Generic and associated types explained, concrete examples of when to use each, and how they may impact various design decisions.
  • Emphasis on procedural macro design and how they work.
  • Use of static vs. dynamic dispatch for typing? How it's useful for our context; this will touch on smart pointers and the use of Box or Rc/RefCell perhaps.
  • Decisions for why you may use specific design patterns, i.e., builder pattern, or any other schemes used in Substrate that are relevant. Of course, emphasis on type safety.
  • Project designs and layout; advanced cargo usage for Wasm/no-std/specialized environments.
  • Understanding advanced workspace and crate usage (more cargo)
  • More emphasis on concrete error handling and runtime panic avoidance.
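
As a possible seed for the static vs. dynamic dispatch material in the list above, here is a sketch with a hypothetical `Checksum` trait and two toy implementors.

```rust
pub trait Checksum {
    fn digest(&self, data: &[u8]) -> u32;
}

pub struct SumChecksum;
pub struct XorChecksum;

impl Checksum for SumChecksum {
    fn digest(&self, data: &[u8]) -> u32 {
        data.iter().fold(0u32, |acc, b| acc.wrapping_add(u32::from(*b)))
    }
}

impl Checksum for XorChecksum {
    fn digest(&self, data: &[u8]) -> u32 {
        data.iter().fold(0u32, |acc, b| acc ^ u32::from(*b))
    }
}

/// Static dispatch: the compiler generates one copy of this function
/// per concrete `C`, and the call target is known at compile time.
pub fn digest_static<C: Checksum>(c: &C, data: &[u8]) -> u32 {
    c.digest(data)
}

/// Dynamic dispatch: a single copy of this function exists, and the
/// call goes through the trait object's vtable at run time.
pub fn digest_dynamic(c: &dyn Checksum, data: &[u8]) -> u32 {
    c.digest(data)
}

/// `Box<dyn Checksum>` lets values of different concrete types live
/// in the same collection.
pub fn all_checksums() -> Vec<Box<dyn Checksum>> {
    vec![Box::new(SumChecksum), Box::new(XorChecksum)]
}
```

For the input `[1, 2, 3]`, `digest_static(&SumChecksum, ..)` gives 6 and `digest_dynamic(&XorChecksum, ..)` gives 0. The trade-off to discuss is binary size and flexibility versus the cost of an indirect call.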

As an end goal, I think the student should be able to at least understand, if not do, the following:

  • Grasp intermediate usage and literacy of more complex type-design schemes in Rust that involve heavy usage of traits
  • Setup an environment suitable for compiling to a Wasm target
  • Create a library that compiles to Wasm
  • Maybe not be able to create a complex macro, but at least understand how macros work and how to read their impls
  • Understand how to interface and use that Wasm bundle, and how that relationship works

@shawntabrizi
Author

@CrackTheCode016 I very much agree with that direction.

This will change the material from being another source of the same information to being a unique source of new information, which is really worthwhile to go through. Not to mention that users will actually be learning skills which will directly apply to their abilities to write blockchains.

Given this feedback, what would the plan be?

Content need not be deleted / removed right away, but I think we can get started by writing these "extension" pages, and then eventually deprecating redundant content.

How best for me to contribute? Do you think I should just write like a new extension page each day on my own?

@CrackTheCode016
Contributor

@shawntabrizi I agree with not deleting content, extension pages sound like a great way to go about it. Afterward, we can simply choose what to include (or not to include).

I think we could do the following:

  1. Define an initial list of what pages we want and what topics we want them to cover, which I can do and have ready by Monday.

  2. Dividing the work might be a good idea here. Deciding which topic you feel you could immediately attack might be worthwhile - for example, memory management in Wasm. Highlighting nuances in a topic like that would be helpful.

At this point, I would like to ask about assessments. What format would you recommend? What frequency (after every concept? every module?).

We could simply provide a task with the expected output at the end of every module, and ask the student to complete it (no final project, as I was thinking before).

@CrackTheCode016
Contributor

Loosely ordered list of specific topics, some may be combined or shifted around:

  • Generic Design Patterns in Rust

    • Generic & Associated Types - when to use which
    • Trait Objects & Dynamic Dispatching
    • Trait-oriented design
    • Case Study: Builders Pattern
  • Rust Memory Management

    • Heap and Stack
    • Smart Pointers (Box, Rc, and Ref)
  • Rust WebAssembly Crash Course

    • Working in a no-std environment (memory management, etc)
    • Defining Wasm functions via host functions
    • Pros & Cons of Wasm, Other Alternatives (polkavm, etc)
    • Error handling & Panic Avoidance
  • Macro Expansion (no pun intended)

    • Understanding macros in Rust
    • Create a procedural macro
    • Create a declarative macro
  • Using and creating Wasm libraries

    • Advanced cargo config (target)
    • Compile a Wasm blob
    • Define host functions and call the Wasm blob
  • Project Structuring?

    • Modules, Crates, Workspaces

@CrackTheCode016
Contributor

CrackTheCode016 commented Sep 25, 2023

A couple of updates on this front:

I spent some time looking briefly at how Substrate builds Wasm. While that is not too relevant to exactly how Substrate uses wasmtime, I think it is worthwhile to showcase how to deal with some of these APIs and the different factors involved in both compiling and reading Wasm. I think we could mention and briefly document the 'ecosystem' of different Wasm tools used in and around Substrate.

Should we focus on using a specific runtime, or keep that decoupled? I think using wasmtime as an example could be beneficial, as it portrays the steps required to go from runtime code -> calling via host functions at a lower level, so the relationship can be understood later. Its API is also pretty much the same as wasmi.


On actionable items, if you think it is beneficial, I will work on putting together Wasm-oriented content first - which, you could review, provide more laser-focused content for Substrate on the same PR(s), and start there. This will be the first of the extension pages. Each page would be its own PR for easier review-ability.

WDYT?

@shawntabrizi
Author

shawntabrizi commented Sep 25, 2023

I think you should absolutely create a Wasm deep dive, but it should not need to reference Substrate or the Runtime at all. It should simply cover the kinds of things you need to consider when developing for wasm32-unknown-unknown.

I think you should touch on:

  • facts and information about wasm in general
  • rust and wasm as an ecosystem which support each other well
    • setting up your rust environment, installing the right toolkits
    • using certain wasm tools in rust: wasm2wat, compact and compressed wasm, etc...
  • execution options
    • interpreted wasm: wasmi - slower, safer
    • compiled wasm: wasmtime - faster, more attackable (creating compile bombs)
  • some of the most basic parts of the wasm compiled code:
    • talking about the imports and exports to the wasm which can be seen in the wat format.
    • the existence of lots of code artifacts, which is why it can be compressed and compactified
  • the non-determinism of wasm compiled code
    • how this can be worked around using a container environment like docker with a deterministic setup.
    • showcase the tool we have for this, srtool or something

I think this is a really good start. I think once you have this, then talking to @pepyakin and @athei can help you expand even further (or at least make sure the details are right), but doubtful we will need to cover much more.

@shawntabrizi
Author

shawntabrizi commented Sep 25, 2023

For the page "data types".

As mentioned before, I don't think re-covering all the data type stuff is super useful, as the Rust book does it better.

Instead here are some topics we should "extend" basic data types with:

  • non-determinism of floating point numbers, therefore they are not used in runtime
  • how to access u256 and other larger data types via libraries, and what the downsides of using those types are (speed)
  • perhaps a small conversation about why you don't see negative numbers much in blockchain, and link to a deeper discussion in the blockchain lessons.
  • the size of scalar and compound types. build a foundation towards MaxEncodedLen
  • panics for overflows in debug / tests, but not in release / production mode
  • simple traits like AtLeast32BitUnsigned
  • converting between types, both lossless and lossy conversions

Probably some more ideas, but this is a good starting point.
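
A brief sketch covering two of the points in the list above: lossless versus lossy conversions, and arithmetic that behaves the same in debug and release builds. The function names are hypothetical.

```rust
use std::convert::TryFrom;

/// Lossless: `From` is only implemented where every input value fits.
pub fn widen(x: u32) -> u64 {
    u64::from(x)
}

/// Fallible: `TryFrom` reports values that do not fit in the target type.
pub fn narrow(x: u64) -> Option<u32> {
    u32::try_from(x).ok()
}

/// Lossy: an `as` cast silently keeps only the low 32 bits.
pub fn truncate(x: u64) -> u32 {
    x as u32
}

/// Explicit wrapping gives the same result in every build profile,
/// instead of panicking in debug and wrapping in release.
pub fn wrapping_inc(x: u8) -> u8 {
    x.wrapping_add(1)
}
```

For a value one past `u32::MAX`, `narrow` returns `None` while `truncate` quietly returns 0. That difference is the reason to prefer `From` and `TryFrom` over `as` wherever the value range is not fully known.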

@CrackTheCode016
Contributor

This is all excellent feedback; I have aggregated it into a single Google doc to provide structure before we start writing up pages of content, so that we can hit every point we want.

We can define the takeaways and questions for each lesson. The student should be able to answer the question based off of the lesson, provided the takeaways are covered within.

It is currently by invite only, as it is unfinished, but for the record, it is here: https://docs.google.com/document/d/13jsFAsARqsntBF9CaaVs1i_nGYYj2ur9MkVcWvXHXLw/edit?usp=sharing

@CrackTheCode016
Contributor

CrackTheCode016 commented Oct 4, 2023

I think we are at a point where these pages can be drafted. As the PRs are created, categorizing them into modules from now might be a good idea, so I will probably open a new issue once that starts that references each subtopic being addressed via a PR. WDYT of this?

My current thinking on this structure is as follows:

Module 1: Rust Basics Overview

  • 1.1 Understanding Alternative Data Types
  • 1.2 Casting & Handling Types
  • 1.3 Common Traits in Rust
  • 1.4 Floating Point Numbers & Safe Math

Module 2: A Comprehensive Look at Rust’s Memory Management Model

  • 2.1 Using Borrowing, References, & Ownership Correctly
  • 2.2 Considerations: Heap & Stack in Rust
  • 2.3 Safe Heap Allocation - Box & RefCell
  • 2.4 Heap Allocation Use Case: Static vs. Dynamic Dispatch

Module 3: Introduction to WebAssembly

  • 3.1 Introduction to WebAssembly
  • 3.2 Working with WebAssembly - Tools & Installation
  • 3.3 Understanding WAT and WASI - the WebAssembly Text Format
  • 3.4 WebAssembly Alternatives

Module 4: Developing WebAssembly Applications with Rust

  • 4.1 Creating & Consuming Wasm Libraries
  • 4.2 Wasm Runtime Comparison
  • 4.3 Memory Management in WebAssembly
  • 4.4 Understanding & Defining Host Functions

Module 5: Introduction to Trait-oriented and Generic Design Patterns in Rust

  • 5.1 Generics vs. Associated Types
  • 5.2 Benefits of Trait-oriented APIs
  • 5.3 Revisiting Dynamic Dispatching with Trait Objects
  • 5.4 Usecase: Creating a generic builder pattern using trait objects

Module 6: Understanding Macros & Attributes

  • 6.1 Macros Usage & Examples
  • 6.2 Create a declarative macro
  • 6.3 Create a procedural macro
  • 6.4 Common Attributes

Module 7: Good Practices: Errors, Bounded Types, & Testing

  • 7.1 Handling Errors Effectively in Rust
  • 7.2 Introducing safe types - BoundedVec, MaxEncodedLen
  • 7.3 Overview of SCALE
  • 7.4 Writing and running unit tests

Module 8: Know Your Cargo: Advanced Cargo Usage, Project Structure, Target Compilation

  • 8.1 Defining a no-std environment
  • 8.2 Modules, crates, & workspaces
  • 8.3 Defining a multi-crate project with Cargo workspaces
  • 8.4 Defining different targets for compilations

@shawntabrizi
Author

shawntabrizi commented Nov 18, 2023

I have been working on a new tutorial which approaches teaching users about how specifically the Rust architecture in Substrate came to be:

https://github.com/shawntabrizi/rust-state-machine/commits/master

The tutorial is navigated through the commits of the repo. To update the tutorial I modify the git history. It will turn into a website with interactive code and documentation similar to the original substrate kitties tutorial.

I think this is a strong anchor point to the kind of content on which we should base teaching students about Rust. It can be extended to touch on further topics, but I think actually where it is right now (ending with the introduction of macros) makes for a clean transition to directly using Polkadot-SDK to learn more.


2 participants