C# · December 23, 2021

C# - How to avoid generation 2 garbage collections caused by long-lived strings

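I have an application that keeps log strings in a circular buffer. Once the log is full, every new insert releases the oldest string for garbage collection, and by then those strings have already been promoted to generation 2. So a generation 2 GC will eventually occur, which I would like to avoid.

I tried marshalling the string into a struct. Surprisingly, I still get generation 2 GCs. The struct apparently still holds a reference to the string. A complete console application follows. Any help is appreciated.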

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication
{
    class Program
    {
        // The marshalling attribute only affects interop layout;
        // the field itself is still a managed string reference.
        [StructLayout(LayoutKind.Sequential)]
        public struct FixedString
        {
            [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 256)]
            private string str;

            public FixedString(string str)
            {
                this.str = str;
            }
        }

        // Same idea with a byte array: the array is also a heap object.
        [StructLayout(LayoutKind.Sequential)]
        public struct UTF8PackedString
        {
            private int length;

            [MarshalAs(UnmanagedType.ByValArray, SizeConst = 256)]
            private byte[] str;

            public UTF8PackedString(int length)
            {
                this.length = length;
                str = new byte[length];
            }

            public static implicit operator UTF8PackedString(string str)
            {
                var obj = new UTF8PackedString(Encoding.UTF8.GetByteCount(str));
                var bytes = Encoding.UTF8.GetBytes(str);
                Array.Copy(bytes, obj.str, obj.length);
                return obj;
            }
        }

        const int BufferSize = 1000000;
        const int LoopCount = 10000000;

        static void Main(string[] args)
        {
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                "Type".PadRight(20), "Time", "GC(0)", "GC(1)", "GC(2)");
            Console.WriteLine();

            for (int i = 0; i < 5; i++)
            {
                TestPerformance<string>(s => s);
                TestPerformance<FixedString>(s => new FixedString(s));
                TestPerformance<UTF8PackedString>(s => s);
                Console.WriteLine();
            }

            Console.ReadKey();
        }

        // Fills a circular buffer of T and reports how many collections
        // of each generation happened while doing so.
        private static void TestPerformance<T>(Func<string, T> func)
        {
            var buffer = new T[BufferSize];
            GC.Collect(2);

            Stopwatch stopWatch = new Stopwatch();
            var initialCollectionCounts = new int[]
            {
                GC.CollectionCount(0),
                GC.CollectionCount(1),
                GC.CollectionCount(2)
            };

            stopWatch.Reset();
            stopWatch.Start();

            for (int i = 0; i < LoopCount; i++)
                buffer[i % BufferSize] = func(i.ToString());

            stopWatch.Stop();

            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                typeof(T).Name.PadRight(20),
                stopWatch.ElapsedMilliseconds,
                (GC.CollectionCount(0) - initialCollectionCounts[0]),
                (GC.CollectionCount(1) - initialCollectionCounts[1]),
                (GC.CollectionCount(2) - initialCollectionCounts[2]));
        }
    }
}

Edit: updated the code with an UnsafeFixedString that does the required work:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication
{
    class Program
    {
        // Requires compiling with unsafe code enabled.
        public unsafe struct UnsafeFixedString
        {
            private int length;
            private fixed char str[256];   // characters stored inline, no heap reference

            public UnsafeFixedString(int length)
            {
                this.length = length;
            }

            // Note: assumes str.Length <= 256; longer input would overrun the buffer.
            public static implicit operator UnsafeFixedString(string str)
            {
                var obj = new UnsafeFixedString(str.Length);
                for (int i = 0; i < str.Length; i++)
                    obj.str[i] = str[i];
                return obj;
            }
        }

        const int BufferSize = 1000000;
        const int LoopCount = 10000000;

        static void Main(string[] args)
        {
            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                "Type".PadRight(20), "Time", "GC(0)", "GC(1)", "GC(2)");
            Console.WriteLine();

            for (int i = 0; i < 5; i++)
            {
                TestPerformance<string>(s => s);
                TestPerformance<UnsafeFixedString>(s => s);
                Console.WriteLine();
            }

            Console.ReadKey();
        }

        private static void TestPerformance<T>(Func<string, T> func)
        {
            var buffer = new T[BufferSize];
            GC.Collect(2);

            Stopwatch stopWatch = new Stopwatch();
            var initialCollectionCounts = new int[]
            {
                GC.CollectionCount(0),
                GC.CollectionCount(1),
                GC.CollectionCount(2)
            };

            stopWatch.Reset();
            stopWatch.Start();

            for (int i = 0; i < LoopCount; i++)
                buffer[i % BufferSize] = func(String.Format("{0}", i));

            stopWatch.Stop();

            Console.WriteLine("{0}\t{1}\t{2}\t{3}\t{4}",
                typeof(T).Name.PadRight(20),
                stopWatch.ElapsedMilliseconds,
                (GC.CollectionCount(0) - initialCollectionCounts[0]),
                (GC.CollectionCount(1) - initialCollectionCounts[1]),
                (GC.CollectionCount(2) - initialCollectionCounts[2]));
        }
    }
}

The output on my machine is:

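Type                 Time    GC(0)   GC(1)   GC(2)
String               5746    160     71      19
UnsafeFixedString    5345    418     0       0

Solution

It should be no surprise that a struct with a string field makes no difference here: a string field is always just a reference to an object on the managed heap, specifically a string object somewhere. That string still exists and will still eventually cause a GC2.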
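
One way to check this for a given struct is to ask the runtime whether the type contains references. The sketch below assumes a runtime that provides RuntimeHelpers.IsReferenceOrContainsReferences<T>() (.NET Core 2.0 or later; it is not available on the .NET Framework version the question targets). The type names WithStringField, WithoutReferences and ReferenceCheck are hypothetical stand-ins, not the types from the question.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical struct wrapping a string: the field is a heap reference.
public struct WithStringField
{
    public string Str;
}

// Hypothetical struct made only of primitive fields: nothing for the GC to follow.
public struct WithoutReferences
{
    public int Length;
    public long Payload;
}

public static class ReferenceCheck
{
    // True when T is a reference type or contains a reference somewhere inside.
    public static bool HoldsReferences<T>()
    {
        return RuntimeHelpers.IsReferenceOrContainsReferences<T>();
    }

    public static void Demo()
    {
        Console.WriteLine(HoldsReferences<WithStringField>());   // True
        Console.WriteLine(HoldsReferences<WithoutReferences>()); // False
    }
}
```

An array of a struct for which this returns false holds all of its data inline, so the individual elements are not separate heap objects.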

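The only way to "fix" this is to not have it be an object at all, and the only way to do that (without going completely outside managed memory) is to use a fixed buffer: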

public unsafe struct FixedString
{
    private fixed char str[100];
}

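Here, every FixedString instance has 200 bytes reserved for the data. str is simply a relative offset to a char* that marks the start of this reservation. However, working with it is tricky and requires unsafe code throughout. Also note that every FixedString reserves the same amount of space regardless of whether you actually want to store 3 characters or 170. To avoid memory problems, you will need either a null terminator or to store the payload length separately.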
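
As a minimal sketch of the "store the payload length separately" option, the struct below adds a length field next to the fixed buffer. The Capacity constant, the Set method and the truncation of over-long input are assumptions made for illustration; they are not part of the original answer. It has to be compiled with unsafe code enabled.

```csharp
using System;

public unsafe struct FixedString
{
    // Assumed capacity, matching the 100 chars (200 bytes) in the example above.
    public const int Capacity = 100;

    private int length;                 // payload length, stored separately
    private fixed char str[Capacity];   // storage reserved inline in the struct

    // Copies the characters into the inline buffer.
    // Assumption: input longer than Capacity is silently truncated.
    public void Set(string value)
    {
        int n = Math.Min(value.Length, Capacity);
        fixed (char* p = str)
        {
            for (int i = 0; i < n; i++)
                p[i] = value[i];
        }
        length = n;
    }

    // Materializes the payload as a new string (this allocates).
    public override string ToString()
    {
        fixed (char* p = str)
        {
            return new string(p, 0, length);
        }
    }
}
```

ToString allocates a fresh string, so it is only meant for the moment a log entry is actually read; the stored entries themselves stay free of references.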

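Note that in .NET 4.5, <gcAllowVeryLargeObjects> support makes it possible to have fairly large arrays of such values (a FixedString[], for example), but you do not want to copy the data very often. To avoid that, always leave spare capacity in the array (so that adding one item does not mean copying the entire array), and work with individual items via ref, i.e.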

FixedString[] data = …
int index = …
ProcessItem(ref data[index]);

void ProcessItem(ref FixedString item)
{
    // …
}

Here, item talks directly to the element in the array; the data is never copied at any point.

Now we have just one object: the array itself.
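
The difference between passing an array element by ref and passing it by value can be sketched as follows. Slot is a hypothetical stand-in for FixedString (any struct without reference fields behaves the same way), and RefDemo, ProcessCopy and Run are illustrative names that do not appear in the original answer.

```csharp
using System;

// Hypothetical stand-in for FixedString: a struct with no reference fields.
public struct Slot
{
    public int Length;
    public long Payload;
}

public static class RefDemo
{
    // Receives the array element itself; writes land in the array.
    public static void ProcessItem(ref Slot item)
    {
        item.Length = 3;
    }

    // Receives a copy of the element; writes are lost on return.
    public static void ProcessCopy(Slot item)
    {
        item.Length = 99;
    }

    // Returns the element's Length after the by-value call and after the by-ref call.
    public static int[] Run()
    {
        var data = new Slot[4];          // the only heap object involved

        ProcessCopy(data[1]);
        int afterCopy = data[1].Length;  // still 0: only the copy was changed

        ProcessItem(ref data[1]);
        int afterRef = data[1].Length;   // 3: the element was changed in place

        return new[] { afterCopy, afterRef };
    }
}
```

With a 200-byte struct, the by-value call also copies all 200 bytes on every call, which is the cost that passing by ref avoids.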