<!-- 🚨 Please Do Not skip any instructions and information mentioned below as they are all required and essential to evaluate and test the PR. By fulfilling all the required information you will be able to reduce the volume of questions and most likely help merge the PR faster 🚨 -->
<!-- 📝 It is preferred if you keep the "☑️ Allow edits by maintainers" checked in the Pull Request Template as it increases collaboration with the Toolkit maintainers by permitting commits to your PR branch (only) created from your fork. This can let us quickly make fixes for minor typos or forgotten StyleCop issues during review without needing to wait on you doing extra work. Let us help you help us! 🎉 -->
## Follow up from #3520
<!-- Add the relevant issue number after the "#" mentioned above (for ex: Fixes#1234) which will automatically close the issue once the PR is merged. -->
<!-- Add a brief overview here of the feature/bug & fix. -->
## PR Type
What kind of change does this PR introduce?
<!-- Please uncomment one or more that apply to this PR. -->
- Optimization
<!-- - Bugfix -->
<!-- - Feature -->
<!-- - Code style update (formatting) -->
<!-- - Refactoring (no functional changes, no api changes) -->
<!-- - Build or CI related changes -->
<!-- - Documentation content changes -->
<!-- - Sample app changes -->
<!-- - Other... Please describe: -->
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying, or link to a relevant issue. -->
The codegen for the second branch in `RuntimeHelpers.ConvertLength` does a signed division:
9b75c9f910/Microsoft.Toolkit.HighPerformance/Helpers/Internals/RuntimeHelpers.cs (L43-L46)
This is suboptimal for the codegen: a signed division by a power of two must round toward zero, so the JIT has to emit extra instructions to adjust negative values before shifting, resulting in the following:
```asm
; [System.Byte, System.Private.CoreLib],[System.Numerics.Vector4, System.Numerics.Vectors]
ConvertLength[TFrom, TTo](Int32)
L0000: mov eax, ecx
L0002: sar eax, 0x1f
L0005: and eax, 0xf
L0008: add eax, ecx
L000a: sar eax, 4
L000d: ret
```
## What is the new behavior?
<!-- Describe how was this issue resolved or changed? -->
Avoided that with a cast to `uint`, since the length is guaranteed to be a non-negative value in `[0, int.MaxValue]` anyway:
```asm
; [System.Byte, System.Private.CoreLib],[System.Numerics.Vector4, System.Numerics.Vectors]
L0000: mov eax, ecx
L0002: shr eax, 4
L0005: ret
```
Perfect! 😄🎉
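For reference, here is a minimal sketch of the idea, not the actual `ConvertLength` code. The `DivisionSketch` class and both method names are hypothetical, and the divisor `16` is just the size ratio from the `byte` to `Vector4` case above. For any non-negative input the two methods return the same value, which is what makes the cast safe:

```csharp
using System;

// Hypothetical helpers for illustration only, not part of the toolkit
public static class DivisionSketch
{
    // Signed division: the JIT must account for negative inputs
    public static int DivideSigned(int length) => length / 16;

    // Unsigned division: valid when length is known to be non-negative,
    // and can be lowered to a single logical shift
    public static int DivideUnsigned(int length) => (int)((uint)length / 16);
}
```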
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~
- [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~
- [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~
- [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes
## Adds the .NET 5 target to `Microsoft.Toolkit.HighPerformance`
## PR Type
What kind of change does this PR introduce?
<!-- Please uncomment one or more that apply to this PR. -->
<!-- - Bugfix -->
- Feature
<!-- - Code style update (formatting) -->
<!-- - Refactoring (no functional changes, no api changes) -->
<!-- - Build or CI related changes -->
<!-- - Documentation content changes -->
<!-- - Sample app changes -->
<!-- - Other... Please describe: -->
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying, or link to a relevant issue. -->
The `Microsoft.Toolkit.HighPerformance` package maxes out at .NET Core 3.1.
The `Microsoft.Toolkit` package maxes out at .NET Standard 2.1.
Additionally, `Microsoft.Toolkit` doesn't have proper nullability annotations, and it reports installing an additional dependency when installed in a .NET 5 app. That dependency is `System.Runtime.CompilerServices.Unsafe`, which is actually built into .NET 5, but consumers not aware of this would still see NuGet report it as an extra indirect dependency during installation.
## What is the new behavior?
<!-- Describe how was this issue resolved or changed? -->
✅ Added .NET 5 target to `Microsoft.Toolkit.HighPerformance`
✅ Added .NET 5 target to `Microsoft.Toolkit`
✅ Enabled global nullability annotations in `Microsoft.Toolkit` and improved the codebase.
✅ Enabled C# 9 in both projects, with some extra code tweaks.
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->
- [ ] Sample in sample app has been added / updated (for bug fixes / features)
- [ ] Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)
- [ ] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes
<!-- If this PR contains a breaking change, please describe the impact and migration path for existing applications below.
Please note that breaking changes are likely to be rejected. -->
## PR Type
What kind of change does this PR introduce?
<!-- Please uncomment one or more that apply to this PR. -->
- Performance improvement
<!-- - Bugfix -->
<!-- - Feature -->
<!-- - Code style update (formatting) -->
<!-- - Refactoring (no functional changes, no api changes) -->
<!-- - Build or CI related changes -->
<!-- - Documentation content changes -->
<!-- - Sample app changes -->
<!-- - Other... Please describe: -->
## What is the new behavior?
<!-- Describe how was this issue resolved or changed? -->
About 20% improvement on .NET 5 when working on `char` types (or larger):
![image](https://user-images.githubusercontent.com/10199417/97509236-ff526e80-1981-11eb-8a90-f8aa72f1551e.png)
This was done by adding an unrolled loop for the vectorized path of the SIMD accelerated version of `Count<T>`.
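As a rough illustration of the technique, here is a simplified sketch and not the code in this PR. The `CountSketch` class is hypothetical, it only handles `int` values, and it unrolls by a factor of two with independent accumulators. `Vector.Equals` yields `-1` in every matching lane, so subtracting the result increments the per-lane counters:

```csharp
using System;
using System.Numerics;

// Hypothetical, simplified sketch of an unrolled vectorized count
public static class CountSketch
{
    public static int Count(int[] values, int target)
    {
        int i = 0, result = 0;
        int step = Vector<int>.Count;

        if (values.Length >= step * 2)
        {
            var targetVector = new Vector<int>(target);
            var partials0 = Vector<int>.Zero;
            var partials1 = Vector<int>.Zero;

            // Unrolled loop: two vectors are compared per iteration
            for (; i <= values.Length - step * 2; i += step * 2)
            {
                partials0 -= Vector.Equals(new Vector<int>(values, i), targetVector);
                partials1 -= Vector.Equals(new Vector<int>(values, i + step), targetVector);
            }

            // Horizontal sum of all the per-lane counters
            result = Vector.Dot(partials0 + partials1, Vector<int>.One);
        }

        // Scalar loop for the remaining items
        for (; i < values.Length; i++)
        {
            if (values[i] == target)
            {
                result++;
            }
        }

        return result;
    }
}
```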
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~
- [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~
- [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~
- [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes
## PR Type
What kind of change does this PR introduce?
<!-- Please uncomment one or more that apply to this PR. -->
<!-- - Bugfix -->
- Feature
<!-- - Code style update (formatting) -->
<!-- - Refactoring (no functional changes, no api changes) -->
<!-- - Build or CI related changes -->
<!-- - Documentation content changes -->
<!-- - Sample app changes -->
<!-- - Other... Please describe: -->
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying, or link to a relevant issue. -->
There is currently no way to interoperate between the `IBufferWriter<T>` interface and the `Stream` class. Many APIs in the BCL and in 3rd party libraries use `Stream` as the standard way to accept an instance that can be written to or read from, and there is no built-in way to have a memory stream that is also using memory pooling, because none of the types in the BCL and in the `HighPerformance` package currently support both features at the same time. This PR fixes that 😄🚀
Consider this example that I saw from a user in the C# Discord server:
```csharp
public byte[] Compress(byte[] source)
{
    MemoryStream output = new MemoryStream();
    using (DeflateStream dstream = new DeflateStream(output, CompressionLevel.Optimal))
    {
        dstream.Write(source, 0, source.Length);
    }
    return output.ToArray();
}

public byte[] Decompress(byte[] source)
{
    MemoryStream input = new MemoryStream(source);
    MemoryStream output = new MemoryStream();
    using (DeflateStream dstream = new DeflateStream(input, CompressionMode.Decompress))
    {
        dstream.CopyTo(output);
    }
    return output.ToArray();
}
```
You can see how the code is very memory inefficient: the `MemoryStream` type will just `new`-up larger arrays as it grows, and the final `ToArray()` call copies the data into yet another array. Even without that last copy, the main issue within `MemoryStream` remains. With the new extension introduced in this PR, these two APIs can be rewritten much more efficiently, like this:
```csharp
public IMemoryOwner<byte> Compress(ReadOnlySpan<byte> span)
{
    ArrayPoolBufferWriter<byte> bufferWriter = new ArrayPoolBufferWriter<byte>();
    using DeflateStream deflateStream = new DeflateStream(bufferWriter.AsStream(), CompressionLevel.Optimal);
    deflateStream.Write(span);
    return bufferWriter;
}

public IMemoryOwner<byte> Decompress(ReadOnlyMemory<byte> memory)
{
    ArrayPoolBufferWriter<byte> bufferWriter = new ArrayPoolBufferWriter<byte>(memory.Length);
    using DeflateStream deflateStream = new DeflateStream(memory.AsStream(), CompressionMode.Decompress);
    deflateStream.CopyTo(bufferWriter.AsStream());
    return bufferWriter;
}
```
This heavily leverages the various APIs and helpers in the `HighPerformance` package, and gives us the following results:
| Method | Categories | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |----------- |------------:|----------:|----------:|------:|---------:|---------:|---------:|----------:|
| new[] | COMPRESS | 29,923.5 us | 174.19 us | 162.94 us | 1.00 | 312.5000 | 312.5000 | 312.5000 | 3089853 B |
| **pool** | COMPRESS | **29,116.0 us** | 120.55 us | 106.87 us | **0.97** | - | - | - | **297 B** |
| | | | | | | | | | |
| new[] | DECOMPRESS | 832.9 us | 9.96 us | 8.83 us | 1.00 | 337.8906 | 336.9141 | 336.9141 | 2966680 B |
| **pool** | DECOMPRESS | **119.6 us** | 0.70 us | 0.62 us | **0.14** | - | - | - | **392 B** |
This benchmark compresses and decompresses a 1 MB buffer, using the two pairs of methods detailed above.
You can see the vastly reduced memory allocations using the pooled writer backed stream 🚀
## What is the new behavior?
<!-- Describe how was this issue resolved or changed? -->
This PR introduces this new extension:
```csharp
namespace Microsoft.Toolkit.HighPerformance.Extensions
{
    public static class ArrayPoolBufferWriterExtensions
    {
        public static Stream AsStream(this ArrayPoolBufferWriter<byte> writer);
    }

    public static class IBufferWriterExtensions
    {
        public static Stream AsStream(this IBufferWriter<byte> writer);
    }
}
```
This helps to interoperate between the `IBufferWriter<T>` interface and the `Stream` class. In particular, since the `HighPerformance` package includes the `ArrayPoolBufferWriter<T>` type, this extension allows users to use that as a `Stream`, and then keep working with the resulting `ReadOnlyMemory<T>` produced by that type, as shown above.
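To give an idea of what such an adapter involves, here is a minimal write-only sketch. It is not the implementation in this PR: the `BufferWriterStream` type is hypothetical, and it only relies on the BCL `IBufferWriter<byte>` contract (`GetSpan` followed by `Advance`):

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical write-only adapter, for illustration only
public sealed class BufferWriterStream : Stream
{
    private readonly IBufferWriter<byte> writer;

    public BufferWriterStream(IBufferWriter<byte> writer) => this.writer = writer;

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();

    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        Write(buffer.AsSpan(offset, count));
    }

    public override void Write(ReadOnlySpan<byte> buffer)
    {
        // Ask the writer for enough space, copy the data, then commit it
        Span<byte> destination = writer.GetSpan(buffer.Length);
        buffer.CopyTo(destination);
        writer.Advance(buffer.Length);
    }

    // Nothing is buffered in the stream itself, so there is nothing to flush
    public override void Flush() { }

    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

A sketch like this can be tried out with the BCL `ArrayBufferWriter<byte>` type, for instance by wrapping it and passing the stream to any API that writes to a `Stream`.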
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~
- [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~
- [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~
- [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes
## PR Type
What kind of change does this PR introduce?
<!-- Please uncomment one or more that apply to this PR. -->
- Feature
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying, or link to a relevant issue. -->
Right now there is no (easy) way to cast a `Memory<TFrom>` instance to a `Memory<TTo>` instance. There are APIs to do that for `Span<T>` instances, but not for `Memory<T>`. The reason is that with a `Span<T>` it's just a matter of retrieving the wrapped reference, reinterpreting it, adjusting the size and creating a new `Span<T>` instance. A `Memory<T>` instance is completely different: it wraps an object which could be either a `T[]` array, a `MemoryManager<T>` instance, etc. The result is that currently there are no APIs in the BCL nor in the toolkit to just "cast" a `Memory<T>`.
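For comparison, this is what the existing `Span<T>` path looks like with the BCL `MemoryMarshal.Cast` API. The `SpanCastExample` wrapper is just a hypothetical helper to keep the snippet self-contained:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around the existing BCL API
public static class SpanCastExample
{
    public static int CountFloats(Span<byte> bytes)
    {
        // Reinterprets the same memory: the length is adjusted
        // by the size ratio between the two element types
        Span<float> floats = MemoryMarshal.Cast<byte, float>(bytes);

        return floats.Length;
    }
}
```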
This feature has been requested by a number of developers, including in a well known library such as `ImageSharp`:
> Yes, that's exactly what I would need. But I'm wondering how would you implement it.
> It's certainly non trivial to cast a `Memory<byte>` to a `Memory<TPixel>` and if there's an API for that I would gladly want to know...
> So I pressume `ImageSharp` would need to do some work under the hood.
(_`ImageSharp` issue, [here](https://github.com/SixLabors/ImageSharp/issues/1097#issuecomment-580639914)_)
To solve that, I submitted a very simplified version of the code included in this PR to `ImageSharp`, in a PR [here](https://github.com/SixLabors/ImageSharp/pull/1314).
Having this available right out of the box in the `HighPerformance` package would be helpful in a number of similar situations, especially with `Memory<T>` APIs becoming more and more common across libraries now (as they've been out for a while).
## What is the new behavior?
<!-- Describe how was this issue resolved or changed? -->
This PR includes 4 new extensions for the `Memory<T>` and `ReadOnlyMemory<T>` types that enable the following:
```csharp
// Cast between two Memory<T> instances...
Memory<byte> memoryOfBytes = new byte[128].AsMemory();
Memory<float> memoryOfFloats = memoryOfBytes.Cast<byte, float>();

// ...as many times as needed
Memory<int> memoryOfInts = memoryOfFloats.Cast<float, int>();
Memory<byte> backToBytesMemory = memoryOfInts.Cast<int, byte>();

// Or just convert into bytes directly
Memory<int> sourceAsInts = new int[128].AsMemory();
Memory<byte> sourceAsBytes = sourceAsInts.AsBytes();

// Want to get a stream from a string? Why not! 😄
using (Stream stream = "Hello world".AsMemory().AsBytes().AsStream())
{
    // Use the stream here, which reads *directly* from the string data!
}
```
Here is the full list of the new APIs introduced in this PR:
```csharp
namespace Microsoft.Toolkit.HighPerformance.Extensions
{
    public static class MemoryExtensions
    {
        public static Memory<byte> AsBytes<T>(this Memory<T> memory)
            where T : unmanaged;

        public static Memory<TTo> Cast<TFrom, TTo>(this Memory<TFrom> memory)
            where TFrom : unmanaged
            where TTo : unmanaged;
    }

    public static class ReadOnlyMemoryExtensions
    {
        public static ReadOnlyMemory<byte> AsBytes<T>(this ReadOnlyMemory<T> memory)
            where T : unmanaged;

        public static ReadOnlyMemory<TTo> Cast<TFrom, TTo>(this ReadOnlyMemory<TFrom> memory)
            where TFrom : unmanaged
            where TTo : unmanaged;
    }
}
```
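One possible way to build such a cast is a custom `MemoryManager<TTo>` that wraps the source memory and reinterprets its span on demand. The following is only a simplified sketch under that assumption, not the implementation in this PR: the `CastMemoryManager<TFrom, TTo>` type is hypothetical, and pinning is deliberately left out:

```csharp
using System;
using System.Buffers;
using System.Runtime.InteropServices;

// Hypothetical, simplified sketch; not the implementation in this PR
public sealed class CastMemoryManager<TFrom, TTo> : MemoryManager<TTo>
    where TFrom : unmanaged
    where TTo : unmanaged
{
    private readonly Memory<TFrom> memory;

    public CastMemoryManager(Memory<TFrom> memory) => this.memory = memory;

    // Reinterpret the wrapped memory every time a span is requested
    public override Span<TTo> GetSpan() => MemoryMarshal.Cast<TFrom, TTo>(memory.Span);

    // Pinning would need pointer arithmetic, so it is omitted in this sketch
    public override MemoryHandle Pin(int elementIndex = 0) => throw new NotSupportedException();

    public override void Unpin() { }

    // The wrapped memory is owned by someone else, so there is nothing to release
    protected override void Dispose(bool disposing) { }
}
```

The resulting `Memory<TTo>` is then available through the `Memory` property inherited from `MemoryManager<TTo>`, and it refers to the same underlying data as the source.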
## Notes
Marking as draft as this is still being worked on, but feedback and reviews are welcome! 😄
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~
- [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~
- [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~
- [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes
## PR Type
What kind of change does this PR introduce?
- Feature
<!-- - Code style update (formatting) -->
<!-- - Refactoring (no functional changes, no api changes) -->
<!-- - Build or CI related changes -->
<!-- - Documentation content changes -->
<!-- - Sample app changes -->
<!-- - Other... Please describe: -->
## What is the current behavior?
<!-- Please describe the current behavior that you are modifying, or link to a relevant issue. -->
There is currently no way to get the underlying `T[]` array from a `MemoryOwner<T>` or `SpanOwner<T>` instance without jumping through some very inconvenient hoops (which are only possible for `MemoryOwner<T>` anyway). Being able to use the array directly is necessary when working with some older APIs that don't offer a `Span<T>` or `Memory<T>` overload.
## What is the new behavior?
<!-- Describe how was this issue resolved or changed? -->
This PR introduces a new `DangerousGetArray` method that mirrors the `MemoryMarshal.TryGetArray` method and works on `MemoryOwner<T>` and `SpanOwner<T>` instances. I've removed the try pattern since here the types guarantee that the underlying memory store will always be an array. The methods are called `Dangerous___` because the array is potentially dangerous to use if a user keeps it after disposing the original owner: at that point the array might have been rented to some other consumer, so using it could lead to unexpected behavior. The methods are not inherently dangerous per se.
## API surface
```csharp
namespace Microsoft.Toolkit.HighPerformance.Buffers
{
    public sealed class MemoryOwner<T>
    {
        public ArraySegment<T> DangerousGetArray();
    }

    public readonly ref struct SpanOwner<T>
    {
        public ArraySegment<T> DangerousGetArray();
    }
}
```
## Example usage
Suppose we have a `Person` class with `string Name`, `string Surname` and `int Age` properties, and we want to calculate an MD5 hash with the current state of the class. This was originally asked by a user in the C# Discord server ([here](https://discordapp.com/channels/143867839282020352/312132327348240384/766694351383560205)).
```csharp
public static string GetMD5Hash(Person person)
{
    using var buffer = new ArrayPoolBufferWriter<byte>();

    buffer.Write<char>(person.Name);
    buffer.Write<char>(person.Surname);
    buffer.Write(person.Age);

    using SpanOwner<byte> hash = SpanOwner<byte>.Allocate(16);
    using var md5 = MD5.Create();

    md5.TryComputeHash(buffer.WrittenSpan, hash.Span, out _);

    return BitConverter.ToString(hash.DangerousGetArray().Array!, 0, 16);
}
```
You can see how here we can leverage the new `DangerousGetArray` API to get the underlying array to use with the `BitConverter.ToString` API, which doesn't have an overload accepting a `ReadOnlySpan<byte>`. The same goes for many other existing APIs that only accept an array as input data instead of the new memory APIs.
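The same pattern can be sketched with the BCL `ArrayPool<T>` type alone, which also shows why the array must not be used once it has been returned. This is just an illustration under that assumption: the `PooledHashExample` class is hypothetical and does not use the toolkit types:

```csharp
using System;
using System.Buffers;
using System.Security.Cryptography;
using System.Text;

// Hypothetical example using only BCL types
public static class PooledHashExample
{
    public static string GetMD5Hex(string text)
    {
        // The rented array may be larger than requested
        byte[] rented = ArrayPool<byte>.Shared.Rent(16);

        try
        {
            using var md5 = MD5.Create();

            md5.TryComputeHash(Encoding.UTF8.GetBytes(text), rented, out int written);

            // The array is only used while it is still rented
            return BitConverter.ToString(rented, 0, written);
        }
        finally
        {
            // After this call the pool may hand the array to another consumer
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```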
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~
- [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~
- [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~
- [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes