Fix multi pages search for gh search #10767
pkg/search/searcher.go
Outdated
// otherwise add all the results.
itemsToAdd := min(len(page.Items), toRetrieve)

result.Total += itemsToAdd
I'm not 100% sure what this total is.
I assume it's the total number of elements returned.
It used to return only the last page's total, which affects the pagination test.
`Total` is used when displaying the results, to communicate how many of the total results are being shown.
For example, `gh search code`:
cli/pkg/cmd/search/code/code.go
Lines 136 to 163 in 072534c
func displayResults(io *iostreams.IOStreams, results search.CodeResult) error {
	cs := io.ColorScheme()
	if io.IsStdoutTTY() {
		fmt.Fprintf(io.Out, "\nShowing %d of %d results\n\n", len(results.Items), results.Total)
		for i, code := range results.Items {
			if i > 0 {
				fmt.Fprint(io.Out, "\n")
			}
			fmt.Fprintf(io.Out, "%s %s\n", cs.Blue(code.Repository.FullName), cs.GreenBold(code.Path))
			for _, match := range code.TextMatches {
				lines := formatMatch(match.Fragment, match.Matches, io)
				for _, line := range lines {
					fmt.Fprintf(io.Out, "\t%s\n", strings.TrimSpace(line))
				}
			}
		}
		return nil
	}
	for _, code := range results.Items {
		for _, match := range code.TextMatches {
			lines := formatMatch(match.Fragment, match.Matches, io)
			for _, line := range lines {
				fmt.Fprintf(io.Out, "%s:%s: %s\n", cs.Blue(code.Repository.FullName), cs.GreenBold(code.Path), strings.TrimSpace(line))
			}
		}
	}
	return nil
}
issue: I have reverted this logic because it causes the `X of Y results` header to report `--limit` for Y instead of the total number of matching search results.
result.Total = page.Total
Demo
Below, with the changed logic, a code search limited to 150 results reports that exactly 150 total matches were found:
$ bin/gh search code "fmt.Fprintf" --repo cli/cli --limit 150
Showing 150 of 150 results
cli/cli script/build.go
fmt.Fprintf(os.Stderr, "Don't know how to build task `%s`.\n", task)
fmt.Fprintf(os.Stderr, "%s: building task `%s` failed.\n", self, task)
With the original logic, we see `150 of 161 results`, where 161 is the total number of matching search entries:
$ bin/gh search code "fmt.Fprintf" --repo cli/cli --limit 150
Showing 150 of 161 results
cli/cli script/build.go
fmt.Fprintf(os.Stderr, "Don't know how to build task `%s`.\n", task)
fmt.Fprintf(os.Stderr, "%s: building task `%s` failed.\n", self, task)
Okay, so it's the total matching the search query... sorry for the useless changes. Would it be worth adding a small comment on the field?
I think that is a solid call out as it really isn't obvious, yeah 💯
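As a rough illustration of the kind of field comment being discussed, here is a minimal sketch. The `codeResult` type and `header` function are simplified, hypothetical stand-ins and not the actual `search.CodeResult` or `displayResults` code; the numbers mirror the demo above.

```go
package main

import "fmt"

// codeResult is a simplified, hypothetical stand-in for a search result type.
type codeResult struct {
	// Items holds the results actually retrieved, capped by the client-side limit.
	Items []string
	// Total is the number of results matching the query as reported by the
	// API. It is not the number of items retrieved.
	Total int
}

// header formats the "Showing X of Y results" line from a result.
func header(r codeResult) string {
	return fmt.Sprintf("Showing %d of %d results", len(r.Items), r.Total)
}

func main() {
	// 150 items retrieved (the limit) out of 161 matches.
	r := codeResult{Items: make([]string, 150), Total: 161}
	fmt.Println(header(r))
}
```

Keeping `Total` separate from `len(Items)` is what lets the header distinguish "how many are shown" from "how many exist".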
		reg.Register(firstReq, firstRes)
		reg.Register(secondReq, secondRes)
	},
},
There are only 4 small changes in these tests.
`per_page` is 30, both in the header and the second request.
The total for the first and second responses has gone from 2 to 1.
This links to a previous review comment: I assume this total is the number of items in the response.
In that case, it has always been 1.
@leudz: As the `.Total` logic changes needed to be reverted, there are going to be some changes to these tests, because `.Total` is not the number of items in the page but the total number of matches to the search query.
Additionally, I've made some enhancements around test table data initialization.
Thank you for your patience and efforts to improve the GitHub CLI, @leudz! ❤️ I'm working to provide some initial feedback today so we can get this shaped up and shipped out. 💪
This change restores the original logic of passing the search total count as is to the result. Additionally, this undoes some of the contributor's formatting changes that increased the number of changed lines to review.
This commit is focused on fixing the `searcher` tests for a few reasons:
1. Correcting the `.Total` logic in the previous commit caused tests to fail
2. Tests involving results that exceed the max per page have been improved with a new `initialize` helper, allowing table test scenarios to be self-contained
3. Tests that stub JSON response payloads have been standardized on maps rather than GitHub type primitives (Repository, Issue, Commit, Code, etc.)
4. Tests had some minor formatting changes to make them easier to understand and maintain
This commit moves the remaining searcher tests from using JSON marshaled types to using JSON responses for consistency. There appears to be a weird JSON marshaling error with search.Repository that does not map `Name` field in the process. Additionally, the test scenarios around pulling multiple pages beneath the total results have been updated to demonstrate that the REST API returns full pages in both of these cases, which is below the total number of results.
	httpStubs: func(reg *httpmock.Registry) {
		firstReq := httpmock.QueryMatcher("GET", "search/repositories", url.Values{
			"page":     []string{"1"},
			"per_page": []string{"100"},
			"order":    []string{"desc"},
			"sort":     []string{"stars"},
			"q":        []string{"keyword stars:>=5 topic:topic"},
		})
		firstRes := httpmock.JSONResponse(map[string]interface{}{
			"incomplete_results": false,
			"total_count":        287,
			"items": initialize(0, 100, func(i int) interface{} {
				return map[string]interface{}{
					"name": fmt.Sprintf("name%d", i),
				}
			}),
		})
		firstRes = httpmock.WithHeader(firstRes, "Link", `<https://api.github.com/search/repositories?page=2&per_page=100&q=org%3Agithub>; rel="next"`)
		secondReq := httpmock.QueryMatcher("GET", "search/repositories", url.Values{
			"page":     []string{"2"},
			"per_page": []string{"100"},
			"order":    []string{"desc"},
			"sort":     []string{"stars"},
			"q":        []string{"keyword stars:>=5 topic:topic"},
		})
		secondRes := httpmock.JSONResponse(map[string]interface{}{
			"incomplete_results": false,
			"total_count":        287,
			"items": initialize(100, 200, func(i int) interface{} {
				return map[string]interface{}{
					"name": fmt.Sprintf("name%d", i),
				}
			}),
		})
		reg.Register(firstReq, firstRes)
		reg.Register(secondReq, secondRes)
	},
},
concern: originally, these tests relied upon the `httpmock.Registry` to marshal the `*Result` types to JSON; however, there is something about the `search.Repository` type that caused the `Name` field not to be preserved in the marshaling.
Rather than having inconsistent test setups, I aligned the tests on capturing the JSON we expect from the REST API.
Closing the loop here, it appears that `search.Repository` JSON marshaling is a problem because of this logic used for exporting `gh search code`, as it also contains a repository field:
Lines 356 to 385 in 5d8dbc0
func (repo Repository) ExportData(fields []string) map[string]interface{} {
	v := reflect.ValueOf(repo)
	data := map[string]interface{}{}
	for _, f := range fields {
		switch f {
		case "license":
			data[f] = map[string]interface{}{
				"key":  repo.License.Key,
				"name": repo.License.Name,
				"url":  repo.License.URL,
			}
		case "owner":
			data[f] = repo.Owner.ExportData()
		default:
			sf := fieldByName(v, f)
			data[f] = sf.Interface()
		}
	}
	return data
}
func (repo Repository) MarshalJSON() ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"id":            repo.ID,
		"nameWithOwner": repo.FullName,
		"url":           repo.URL,
		"isPrivate":     repo.IsPrivate,
		"isFork":        repo.IsFork,
	})
}
Given this surprising behavior, the move away from using the `Results` type for response mocking makes more sense.
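The effect can be reproduced in isolation. The sketch below uses a hypothetical, trimmed-down `repo` type (not the real `search.Repository`, and the struct tags are assumptions) with a custom `MarshalJSON` that writes only export-oriented keys, then decodes the output back into the same type:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// repo is a hypothetical, trimmed-down stand-in for a repository type.
type repo struct {
	Name     string `json:"name"`
	FullName string `json:"full_name"`
}

// MarshalJSON overrides the default encoding and emits only an
// export-oriented key, so "name" is never written.
func (r repo) MarshalJSON() ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"nameWithOwner": r.FullName,
	})
}

// roundTrip marshals a repo and decodes the bytes back into a fresh value.
func roundTrip(in repo) (repo, error) {
	var out repo
	b, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(b, &out)
	return out, err
}

func main() {
	out, err := roundTrip(repo{Name: "cli", FullName: "cli/cli"})
	fmt.Printf("%+v %v\n", out, err)
}
```

Because the encoder and decoder disagree on key names, a mock built by marshaling the typed value hands the code under test a payload with no `name` key, which is why stubbing the raw JSON shape is the more predictable choice.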
secondRes := httpmock.JSONResponse(map[string]interface{}{
	"incomplete_results": false,
	"total_count":        287,
	"items": initialize(100, 200, func(i int) interface{} {
thought: while discussing these tests with @jtmcg, Tyler called out that the REST API will return a full second page. He suggested initializing this to be a full page, which is closer to the behavior we will encounter in use.
	},
},
{
	name: "collect full and partial pages under total number of matching search results",
thought: after talking with @jtmcg, Tyler suggested renaming these test scenarios to explain they are ensuring these commands process REST responses and return a collection of full and partial pages to the user while still being beneath the total number of matching search results.
})
firstRes := httpmock.JSONResponse(map[string]interface{}{
	"incomplete_results": false,
	"total_count":        287,
thought: the REST search API returns the total number of matching search results on every page. In order to have a number stand out separate from the client-side limit, Tyler suggested 287, which is as good as any.
That's a very nice change 👍
@leudz: Because I made quite a few more changes to this PR rather than opening a ton of suggestions, I've asked a fellow maintainer to be the final PR reviewer, as I'd value a second set of eyes. 🙇 Thank you again for your patience!
- update local variables to communicate what they are
- added docblock explaining search results populated
@andyfeller I've added a few comments for Total like we talked about.
LGTM! ✨ The problem and fix make sense, and this branch does fix the problem when I tested it out myself.
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [cli/cli](https://github.com/cli/cli) | minor | `v2.69.0` -> `v2.72.0` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>cli/cli (cli/cli)</summary>

### [`v2.72.0`](https://github.com/cli/cli/releases/tag/v2.72.0): GitHub CLI 2.72.0

[Compare Source](cli/cli@v2.71.2...v2.72.0)

#### Accessibility public preview

This release marks the public preview of several accessibility improvements to the GitHub CLI that have been under development over the past year in partnership with our friends at [Charm](https://github.com/charmbracelet) including:

- customizable and contrasting colors
- non-interactive user input prompting
- text-based spinners

These new experiences are captured in a new `gh a11y` help topic command, which goes into greater detail into the motivation behind each of them as well as opt-in configuration settings / environment variables.

We would like you to share your feedback and join us on this journey through one of [GitHub Accessibility feedback channels](https://accessibility.github.com/feedback)! 🙌

#### What's Changed

##### ✨ Features

- Introduce `gh accessibility` help topic highlighting GitHub CLI accessibility experiences by [@andyfeller](https://github.com/andyfeller) in cli/cli#10890
- \[gh pr view] Support `closingIssuesReferences` JSON field by [@iamazeem](https://github.com/iamazeem) in cli/cli#10544

##### 🐛 Fixes

- Fix expected error output of `TestRepo/repo-set-default` by [@aconsuegra](https://github.com/aconsuegra) in cli/cli#10884
- Ensure accessible password and auth token prompters disable echo mode by [@andyfeller](https://github.com/andyfeller) in cli/cli#10885
- Fix: Accessible multiselect prompt respects default selections by [@BagToad](https://github.com/BagToad) in cli/cli#10901

#### New Contributors

- [@aconsuegra](https://github.com/aconsuegra) made their first contribution in cli/cli#10884

**Full Changelog**: cli/cli@v2.71.2...v2.72.0

### [`v2.71.2`](https://github.com/cli/cli/releases/tag/v2.71.2): GitHub CLI 2.71.2

[Compare Source](cli/cli@v2.71.1...v2.71.2)

#### What's Changed

- Fix pr create when push.default tracking and no merge ref by [@williammartin](https://github.com/williammartin) in cli/cli#10863

**Full Changelog**: cli/cli@v2.71.1...v2.71.2

### [`v2.71.1`](https://github.com/cli/cli/releases/tag/v2.71.1): GitHub CLI 2.71.1

[Compare Source](cli/cli@v2.71.0...v2.71.1)

#### What's Changed

- Fix pr create when branch name contains slashes by [@williammartin](https://github.com/williammartin) in cli/cli#10859

**Full Changelog**: cli/cli@v2.71.0...v2.71.1

### [`v2.71.0`](https://github.com/cli/cli/releases/tag/v2.71.0): GitHub CLI 2.71.0

[Compare Source](cli/cli@v2.70.0...v2.71.0)

#### What's Changed

##### ✨ Features

- `gh pr create`: Support Git's `@{push}` revision syntax for determining head ref by [@BagToad](https://github.com/BagToad) in cli/cli#10513
- Introduce option to opt-out of spinners by [@BagToad](https://github.com/BagToad) in cli/cli#10773
- Update configuration support for accessible colors by [@andyfeller](https://github.com/andyfeller) in cli/cli#10820
- `gh config`: add config settings for accessible prompter and disabling spinner by [@BagToad](https://github.com/BagToad) in cli/cli#10846

##### 🐛 Fixes

- Fix multi pages search for gh search by [@leudz](https://github.com/leudz) in cli/cli#10767
- Fix: `project` commands use shared progress indicator by [@BagToad](https://github.com/BagToad) in cli/cli#10817
- Issue commands should parse args early by [@williammartin](https://github.com/williammartin) in cli/cli#10811
- Feature detect v1 projects on `issue view` by [@williammartin](https://github.com/williammartin) in cli/cli#10813
- Feature detect v1 projects on non web-mode `issue create` by [@williammartin](https://github.com/williammartin) in cli/cli#10815
- Feature detect v1 projects on web mode issue create by [@williammartin](https://github.com/williammartin) in cli/cli#10818
- Feature detect v1 projects on issue edit by [@williammartin](https://github.com/williammartin) in cli/cli#10819

##### 📚 Docs & Chores

- Refactor Sigstore verifier logic by [@malancas](https://github.com/malancas) in cli/cli#10750

##### Dependencies

- chore(deps): bump github.com/sigstore/sigstore-go from 0.7.1 to 0.7.2 by [@dependabot](https://github.com/dependabot) in cli/cli#10787
- Bump google.golang.org/grpc from 1.71.0 to 1.71.1 by [@dependabot](https://github.com/dependabot) in cli/cli#10758

#### New Contributors

- [@leudz](https://github.com/leudz) made their first contribution in cli/cli#10767

**Full Changelog**: cli/cli@v2.70.0...v2.71.0

### [`v2.70.0`](https://github.com/cli/cli/releases/tag/v2.70.0): GitHub CLI 2.70.0

[Compare Source](cli/cli@v2.69.0...v2.70.0)

#### Accessibility

This release contains dark shipped changes that are part of a larger GitHub CLI accessibility preview still under development. More information about these will be announced later this month including various channels to work with GitHub and GitHub CLI maintainers on shaping these experiences.

##### Ensure table headers are thematically contrasting

[#8292](cli/cli#8292) is a long time issue where table headers were difficult to see in terminals with light background. Ahead of the aforementioned preview, `v2.70.0` has shipped changes that improve the out-of-the-box experience based on terminal background detection.

The following screenshots demonstrate the Mac Terminal using the Basic profile, which responds to user's appearance preferences:

<img width="1512" alt="Screenshot of gh repo list in light background terminal" src="https://github.com/user-attachments/assets/87413dde-eec8-43eb-9c16-dc84f8249ddf" />

<img width="1512" alt="Screenshot of gh repo list in dark background terminal" src="https://github.com/user-attachments/assets/7430b42c-7267-402b-b565-a296beb4d5ea" />

For more information including demos from various official distributions, see [#10649](cli/cli#10649).

#### What's Changed

##### ✨ Features

- Update go-gh and document available sprig funcs by [@BagToad](https://github.com/BagToad) in cli/cli#10680
- Introducing experimental support for rendering markdown with customizable, accessible colors by [@andyfeller](https://github.com/andyfeller) [@jtmcg](https://github.com/jtmcg) in cli/cli#10680
- Ensure table datetime columns have thematic, customizable muted text by [@andyfeller](https://github.com/andyfeller) in cli/cli#10709
- Ensure table headers are thematically contrasting by [@andyfeller](https://github.com/andyfeller) in cli/cli#10649
- Introduce configuration setting for displaying issue and pull request labels in rich truecolor by [@andyfeller](https://github.com/andyfeller) in cli/cli#10720
- Ensure muted text is thematic and customizable by [@andyfeller](https://github.com/andyfeller) in cli/cli#10737
- \[gh repo create] Show host name in repo creation prompts by [@iamazeem](https://github.com/iamazeem) in cli/cli#10516
- Introduce accessible prompter for screen readers (preview) by [@BagToad](https://github.com/BagToad) in cli/cli#10710

##### 🐛 Fixes

- `run list`: do not fail on organization/enterprise ruleset imposed workflows by [@BagToad](https://github.com/BagToad) in cli/cli#10660
- Implement safeguard for `gh alias delete` test, prevent wiping out GitHub CLI configuration by [@andyfeller](https://github.com/andyfeller) in cli/cli#10683
- Pin third party actions to commit sha by [@BagToad](https://github.com/BagToad) in cli/cli#10731
- Fallback to job run logs when step logs are missing by [@babakks](https://github.com/babakks) in cli/cli#10740
- \[gh ext] Fix `GitKind` extension directory path by [@iamazeem](https://github.com/iamazeem) in cli/cli#10609
- Fix job log resolution to skip legacy logs in favour of normal/new ones by [@babakks](https://github.com/babakks) in cli/cli#10769

##### 📚 Docs & Chores

- `./script/sign` cleanup by [@iamazeem](https://github.com/iamazeem) in cli/cli#10599
- Fix typos in CONTRIBUTING.md by [@rylwin](https://github.com/rylwin) in cli/cli#10657
- Improve `gh at verify --help`, document json output by [@phillmv](https://github.com/phillmv) in cli/cli#10685
- Acceptance test issue/pr create/edit with project by [@williammartin](https://github.com/williammartin) in cli/cli#10707
- Escape dots in regexp pattern in `README.md` by [@babakks](https://github.com/babakks) in cli/cli#10742
- Simplify cosign verification example by not using a regex. by [@kommendorkapten](https://github.com/kommendorkapten) in cli/cli#10759
- Document UNKNOWN STEP in run view by [@williammartin](https://github.com/williammartin) in cli/cli#10770

##### Dependencies

- Update github.com/sigstore/sigstore-go to 0.7.1 and fix breaking function change by [@malancas](https://github.com/malancas) in cli/cli#10749

#### New Contributors

- [@rylwin](https://github.com/rylwin) made their first contribution in cli/cli#10657

**Full Changelog**: cli/cli@v2.69.0...v2.70.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
This fixes how `gh search` retrieves items when more than 100 items are requested and the requested count is not a multiple of 100.
When using pagination, two variables are used: page and perPage. Pages can contain a maximum of 100 items.
The previous implementation tried to save the API some work by requesting fewer items for the last page.
But it didn't handle pagination correctly and ended up requesting some items a second time.
For example if 150 items are queried, the first page returns items 0->99.
Then it would try to query 50 items by asking for page 2 with 50 per page.
But this would return items 50->99 which isn't what we want.
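To make the arithmetic concrete, here is a small sketch of how a 1-based page number and a page size map to item indexes. The `window` function is a hypothetical helper written for this illustration, not code from the PR.

```go
package main

import "fmt"

// window returns the zero-based, inclusive index range covered by a
// 1-based page number at the given page size.
func window(page, perPage int) (first, last int) {
	first = (page - 1) * perPage
	return first, first + perPage - 1
}

func main() {
	f, l := window(1, 100)
	fmt.Printf("page=1 per_page=100 -> items %d..%d\n", f, l)

	// Changing per_page between requests shifts the window: page 2 at
	// 50 per page overlaps the second half of the first request.
	f, l = window(2, 50)
	fmt.Printf("page=2 per_page=50  -> items %d..%d\n", f, l)
}
```

The page offset is always computed from the current page size, so shrinking the page size on the last request moves the window backwards into items that were already fetched.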
This new version instead picks a fixed amount per page, in this case 100.
The first page returns items 0->99 and the second 100->199.
Then we discard items 150->199.
When fewer than 100 items are requested, there is no change: we only request what we need.
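The approach described above can be sketched as follows. This is a simplified model and not the code in `pkg/search/searcher.go`: `fetchPage` is a hypothetical stand-in for one API request that returns item indexes, and `total` plays the role of the number of matching results on the server.

```go
package main

import "fmt"

const maxPerPage = 100

// fetchPage is a hypothetical stand-in for one API request. It returns the
// indexes of the items on the given 1-based page, bounded by total.
func fetchPage(page, perPage, total int) []int {
	var items []int
	for i := (page - 1) * perPage; i < page*perPage && i < total; i++ {
		items = append(items, i)
	}
	return items
}

// collect retrieves up to limit items, using the same page size for every
// request and discarding any surplus from the last page.
func collect(limit, total int) []int {
	perPage := limit
	if perPage > maxPerPage {
		perPage = maxPerPage
	}
	var out []int
	for page := 1; len(out) < limit; page++ {
		items := fetchPage(page, perPage, total)
		if len(items) == 0 {
			break // no more results on the server
		}
		if remaining := limit - len(out); len(items) > remaining {
			items = items[:remaining]
		}
		out = append(out, items...)
	}
	return out
}

func main() {
	got := collect(150, 287)
	fmt.Println(len(got), got[0], got[len(got)-1])
}
```

With a limit of 150, the second request still asks for 100 items and the last 50 are dropped client-side, which trades a slightly larger response for simpler and correct page bookkeeping.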
In theory it's possible to both save the API some work and keep track of pagination but that would be more complex.
Fixes #9749