Add per-source telemetry for package IDs containing non-alphanumeric, dash, dot, or underscore characters#7213
Add per-source telemetry for package IDs containing non-alphanumeric, dash, dot, or underscore characters#7213
Conversation
… per feed source Co-authored-by: jeffkl <17556515+jeffkl@users.noreply.github.com>
Co-authored-by: jeffkl <17556515+jeffkl@users.noreply.github.com>
Co-authored-by: jeffkl <17556515+jeffkl@users.noreply.github.com>
7e34992 to
79664c2
Compare
|
@copilot |
… PublicAPI files Agent-Logs-Url: https://github.com/NuGet/NuGet.Client/sessions/99e74a70-9115-4726-adbb-4b262f30d6da Co-authored-by: nkolev92 <2878341+nkolev92@users.noreply.github.com>
Head branch was pushed to by a user without write access
Fixed in 5a894ff. The changes incorporate dev's nullability Phase 1 work for
Build: 0 warnings, 0 errors. |
| /// <summary> | ||
| /// Gets the package ID of the copied nupkg, or <see langword="null"/> if not available. | ||
| /// </summary> | ||
| public string? PackageId { get; } |
There was a problem hiding this comment.
@jeffkl This can be non null if we force copilot to just "break" the API potentially.
We can also just force is to annotate as non-null on purpose and annotate the old constructor as null and default the package ID to string.Empty?
I tend to prefer non-null whenever possible so I'm fine with the breaking change.
cc @zivkan
There was a problem hiding this comment.
Plus "ProtocolDiagnostics" were always intended to be used by our telemetry. Since we refuse to use InternalsVisibleTo for anything other than tests, we have no technical means to make it internal to NuGet across product assembly boundaries. But for this reason, I'm much less concerned about breaking changes in ProtocolDiagnostics than I am about any other API.
There was a problem hiding this comment.
Since we refuse to use InternalsVisibleTo for anything other than tests, we have no technical means to make it internal to NuGet across product assembly boundaries
This has been a sanity policy that tends to work super well as a general rule.
That being said, exposing InternalsVisibileTo to projects that never ship as packages, probably won't break anything.
There was a problem hiding this comment.
The problem is that InternalsVisibleTo doesn't allow you to limit it to just ProtocolDiagnostics or a limited list of types. It makes every internal modifier available. And when using Intellisense or similar from that target assembly, giving you a list of available types and autocomplete, you won't know if the NuGet.Protocol type is internal or public, making it too easy to use types that we really don't want to be used outside NuGet.Protocol. So, I don't like the idea of using InternalsVisibleTo some of our VS assembles.
Bug
Fixes: NuGet/Home#14212
Description
Adds telemetry to track whether any package ID contains characters outside
[A-Za-z0-9.\-_](i.e., anything other than ASCII letters, digits,., or-,_). This is emitted in two places:PackageSourceDiagnosticsevent) — correlates non-standard ID usage to specific feed providersProjectRestoreInformationevent) — indicates whether any resolved package in the restore graph has a non-standard IDProtocol layer
PackageIdproperty toProtocolDiagnosticNupkgCopiedEvent(nullable; new 3-param constructor, old 2-param delegates to it)FindPackagesByIdNupkgDownloader→identity.IdLocalPackageArchiveDownloader→_packageIdentity.IdLocalV3FindPackageByIdResource/LocalV2FindPackageByIdResource→idparameterPublicAPI.Unshipped.txtfilesPer-source telemetry (
PackageSourceTelemetry)Dataclass: newIdContainsNonAsciiCharacterbool, sticky-true once any non-standard ID is seen for that sourceReadOnlySpan<char>character loop (local functionHasNonASCIICharacters)AddNupkgCopiedData: skips check whenPackageIdis null (events raised by old constructor)ToTelemetryAsync: emitsnupkgs.idcontainsnonasciicharacteralongside the existingnupkgs.copied/nupkgs.bytesPer-restore telemetry (
RestoreCommand)AnyPackageIdContainsNonASCIICharactersproperty on theProjectRestoreInformationtelemetry eventTests
PackageSourceTelemetryTests— Existing tests coveringAddNupkgCopiedDatafor: all-standard IDs →false; non-standard IDs (underscore, Unicode, space,@,+) →true; mixed batch with one non-standard → stickytrue; nullPackageId→false.ToTelemetry_WithData_CreatesTelemetryPropertiesasserts the new property is emitted.RestoreCommandTests— New tests:ExecuteAsync_WithASCIIPackageId_AnyPackageIdContainsNonASCIICharactersIsFalse— standard[A-Za-z0-9.-]package →falseExecuteAsync_WithNonASCIIPackageId_AnyPackageIdContainsNonASCIICharactersIsTrue—[Theory]with underscore, Kelvin sign (U+212A), Greek alpha (U+03B1), accented Latin (U+00E9), Cyrillic (U+0410) →trueExecuteAsync_WithMixedPackageIds_AnyPackageIdContainsNonASCIICharactersIsTrue—[Theory]where direct dep is standard but transitive dep has non-standard chars →truePR Checklist