Fix for #1860: Closing stdout pipe before function return #1861

monitorjbl · 2017-01-13T18:42:52Z

No description provided.

ttaylorr

This should be safe since the body of the function consumes the reader until scanner.Scan() returns false (meaning the pipe is closed, and the command has already terminated). In the early-exit case, this is safe as well since we don't care about the remainder of the data.

Going to hold off on giving an explicit approval on this since @technoweenie worked on this recently.

technoweenie · 2017-01-13T19:48:38Z

This looks good, but now I'd really like some tests for it :) We should be able to come up with a prune test to trigger #1860.

@monitorjbl: If you want to write a test, that'd be awesome. If not, we'll write one and then merge this. However, I do ask that you sign our CLA :)

monitorjbl · 2017-01-13T20:41:35Z

@technoweenie I'll give writing a test a shot. I'm gonna have to figure out how to generate "old" commits in a repo if you want an integration test like this though.

I did sign the CLA, I'm not sure why the check didn't update. When I click on it again, it says I've already signed it:

technoweenie · 2017-01-13T21:31:44Z

The prune test uses lfstest-testutils to create commits from a custom json document. You can create old commits like this.

Are you logging into the cla app as a different github user? That's the only thing I can think of. I'll see if I can peek at the server too.

monitorjbl · 2017-01-13T21:52:44Z

I was on the right github account, but I made the commit on my work laptop so the email address in the commit history was wrong. Looks like changing that did the trick!

monitorjbl · 2017-01-16T15:57:20Z

@technoweenie It doesn't look like your testing library actually sets the dates on the commits. I ran the pre-existing tests in test/prune-verify.sh with the following options:

KEEPTRASH=1 GIT_LFS_TEST_DIR=/tmp/git-lfs-tests ./test/test-prune.sh

One of the tests in here ("prune verify") looks like its supposed to be making commits that are up to 50 days old. I verified that it outputs JSON that looks like this:

[
  {
    "CommitDate":"2016-11-27T15:48:28Z",
    "Files":[
      {"Filename":"file.dat","Size":28, "Data":"Content for commit 1 (prune)"}]
  },
  {
    "CommitDate":"2016-12-07T15:48:28Z",
    "Files":[
      {"Filename":"file.dat","Size":42, "Data":"Content for commit 2 (prune - fail verify)"}]
  },
  {
    "CommitDate":"2016-12-12T15:48:28Z",
    "Files":[
      {"Filename":"file.dat","Size":28, "Data":"Content for commit 3 (prune)"}]
  },
  {
    "CommitDate":"2016-12-22T15:48:28Z",
    "Files":[
      {"Filename":"file.dat","Size":12, "Data":"HEAD content"}]
  }
]

So, I went to the repo that was generated for one of the tests and looked at the real git log:

$ cd /tmp/git-lfs-tests/test-prune.sh-74530/clone_prune_verify
$ git log

commit 284e764360c0b538a723eb2af6f82e0de51744bb
Author: Git LFS Tests <git-lfs@example.com>
Date:   Mon Jan 16 10:41:10 2017 -0500

    Test commit 3

commit fee71f23920ff0a1ed402ed7f27afe68e45266d0
Author: Git LFS Tests <git-lfs@example.com>
Date:   Mon Jan 16 10:41:10 2017 -0500

    Test commit 2

commit ec75c6e3984033d58c63e1d7a34e5937784ce5de
Author: Git LFS Tests <git-lfs@example.com>
Date:   Mon Jan 16 10:41:10 2017 -0500

    Test commit 1

commit 0459cbd1dc89c3048e65f49cd7469e080858974d
Author: Git LFS Tests <git-lfs@example.com>
Date:   Mon Jan 16 10:41:10 2017 -0500

    Test commit 0

The dates are all the current date. I'm running this on OSX, but it looks like you have a bit of shell script in testhelpers.sh to handle this case.

I'd originally assumed that bufio.Scanner was exhausting the stdout pipe on small repos even if the read loop broke early. That may still be the case, but I wonder if the commit dates not actually being set is why this issue was not found before?

monitorjbl · 2017-01-16T16:57:20Z

Well, scratch that @technoweenie . TIL about GIT_AUTHOR_DATE and GIT_COMMITTER_DATE. Looks like the for-each-ref call that gets made only looks at the commit date. Still, it would be nice if the commits actually looked correct from git log. If you don't mind, I'd like to add the following change to test/testutils.go as part of this PR:

diff --git a/test/testutils.go b/test/testutils.go
index c2bb3e21..bfae1260 100644
--- a/test/testutils.go
+++ b/test/testutils.go
@@ -298,8 +298,10 @@ func commitAtDate(atDate time.Time, committerName, committerEmail, msg string) e
        // set GIT_COMMITTER_DATE environment var e.g. "Fri Jun 21 20:26:41 2013 +0900"
        if atDate.IsZero() {
                env = append(env, "GIT_COMMITTER_DATE=")
+               env = append(env, "GIT_AUTHOR_DATE=")
        } else {
                env = append(env, fmt.Sprintf("GIT_COMMITTER_DATE=%v", git.FormatGitDate(atDate)))
+               env = append(env, fmt.Sprintf("GIT_AUTHOR_DATE=%v", git.FormatGitDate(atDate)))
        }
        cmd.Env = env
        out, err := cmd.CombinedOutput()

technoweenie

This is great work, thanks @monitorjbl 🤘 Nice "git author" catch in the test utils.

technoweenie

Thanks for taking a stab at this. I started leaving review comments, and then noticed something in the (*exec.Cmd) Wait() error func:

c.closeDescriptors(c.closeAfterWait)

If you look in that exec.go file, you can search for closeAfterWait, and see how it adds pipes to that slice, and then closes them after Start() and Wait(). So, it appears that this entire PR isn't even needed. However, the test you wrote in this PR definitely hangs (or at least takes way longer than I'm willing to let a test run ).

My review feedback should still be valid. However, I want to try to dig further into this to understand why this PR is even needed.

technoweenie · 2017-01-18T16:38:16Z

subprocess/cmd.go

Can we name this just Cmd? So *subprocess.Cmd is a wrapper around *exec.Cmd.

technoweenie · 2017-01-18T16:43:18Z

subprocess/cmd.go

Instead of adding the *exec.Cmd as an private field, you can simply embed it in this type.

package subprocess type Cmd struct { *exec.Cmd }

When it has no field name, it means this new struct is embedding the *exec.Cmd. Embedding means any public fields and functions on *exec.Cmd are also on *subprocess.Cmd.

cmd := &subprocess.Cmd{Cmd: &exec.Command("name", "arg")} cmd.Start() // go treats this as `cmd.Cmd.Start()`

This should let you get rid of those other public fields like Path, Args, Env, etc.

The other thing about embedding is it lets you override the functions.

func (c *Cmd) Start() error { fmt.Println("Starting") err := c.Cmd.Start() fmt.Println("started", err) return err }

I actually tried this the first time since I'd read about Go's inheritance-like ability, but I got some pointer errors because I didn't realize you had to initialize the child struct like this. Thanks for the tip, I got it to work!

I spoke a bit too soon. I need to be able to override the *Pipe() functions and call back to the original exec.Cmd function to do so. Fortunately, I can still include a reference to the original exec.Cmd struct while embedding. Cuts the code down quite a bit.

Well, that was a rollercoaster of Go education. I think I got it working the way you originally suggested now.

technoweenie · 2017-01-18T16:44:36Z

subprocess/cmd.go

I think this is the only *exec.Cmd function that needs to be overwritten. Both Output() and CombinedOutput() call Run(), and Run() calls Wait().

I think the *Pipe() functions need to be overridden as well. There's no way to get ahold of the pipes created by exec.Cmd's functions.

technoweenie · 2017-01-18T16:50:11Z

subprocess/cmd.go

Instead of tracking just one of each type, why not just track it as a slice of io.Closer? Functions like StdoutPipe() can just append the pipe to the slice, and Wait()can just iterate through the slice, callingClose()` on each item.

monitorjbl · 2017-01-18T17:17:09Z

I would guess the hang happens at line 434 before the c.closeDescriptors(c.closeAfterWait) gets called. The process can't complete while the pipe still has data.

I'll get these fixes in, thanks for the pointers! Still somewhat new to Go, so any advice is appreciated.

technoweenie · 2017-01-18T17:19:55Z

I would guess the hang happens at line 434 before the c.closeDescriptors(c.closeAfterWait) gets called.

Great catch 🤘 Ok carry on!

Created subprocess.Cmd wrapper to autoclose any created pipes Removing original fix for git-lfs#1860

technoweenie · 2017-01-19T21:36:02Z

Nice, this looks much better 🤘 Thanks for tackling this. Hope your first Go experience was a good one :)

In previous commits in this PR we introduced a new utility command for our test suite, lfstest-genrandom, which reliably produces random data on all the platforms we support (in several forms of base64 encoding or as raw binary data), and then converted much of our test suite to make use of this new command. As noted in one of those commits, one test in our t/t-prune.sh test script makes use of the "uuidgen" command to populate some Git LFS objects with arbitrary text, but as this command does not exist in the Git Bash environment on Windows, the blobs actually have empty data in this case. The test still passes because it doesn't depend on the length of the random data, and the error exit code from the missing "uuidgen" command is ignored. We therefore revise this "prune verify large numbers of refs" test (which was added in commit dd19c1b of PR git-lfs#1861) to call our new lfstest-genrandom command instead of the "uuidgen" command. We use the --base64 flag and a slightly larger size for the output base64-encoded string than the length of a UUID. We also adjust the test so that it does not create the Git LFS objects with sizes that do not match the length of their arbitrary content. (Again, the test passed despite this inconsistency in its test data.)

In previous commits in this PR we introduced a new utility program for our test suite, lfstest-genrandom, which reliably produces random data on all the platforms we support (in several forms of base64 encoding or as raw binary data), and then converted much of our test suite to make use of this new program. As noted in one of those commits, one test in our t/t-prune.sh test script makes use of the "uuidgen" command to populate some Git LFS objects with arbitrary text, but as this command does not exist in the Git Bash environment on Windows, the blobs actually have empty data in this case. The test still passes because it doesn't depend on the length of the random data, and the error exit code from the missing "uuidgen" command is ignored. We therefore revise this "prune verify large numbers of refs" test (which was added in commit dd19c1b of PR git-lfs#1861) to call our new lfstest-genrandom program instead of the "uuidgen" command. We use the --base64 flag and a slightly larger size for the output base64-encoded string than the length of a UUID. We also adjust the test so that it does not create the Git LFS objects with sizes that do not match the length of their arbitrary content. (Again, the test passed despite this inconsistency in its test data.)

In previous commits in this PR we introduced a new utility program for our test suite, lfstest-genrandom, which reliably produces random data on all the platforms we support (in several forms of Base64 encoding, or as raw binary data), and then converted much of our test suite to make use of this new program. As noted in one of those commits, one test in our t/t-prune.sh test script makes use of the uuidgen(1) command to populate some Git LFS objects with arbitrary text, but as this command does not exist in the Git Bash environment on Windows, the objects actually have empty data on that platform. The test still passes because it doesn't depend on the length of the random data, and because the non-zero exit code caused by the missing uuidgen(1) command is ignored as it appears earlier in a pipeline whose final command (our lfstest-testutils program) always succeeds. We therefore revise this "prune verify large numbers of refs" test (which was added in commit dd19c1b of PR git-lfs#1861) to call our new lfstest-genrandom program instead of the uuidgen(1) command. We pass the --base64 flag to our program and request a slightly larger size for the output Base64-encoded string than the length of a UUID, just to ensure we are using a similar amount of random data as before, although this doesn't actually affect the success or failure of the test. We also adjust the test so that it does not create the Git LFS objects with sizes that do not match the length of their arbitrary content. (Again, the test passed despite this inconsistency in its test data.)

monitorjbl mentioned this pull request Jan 13, 2017

lfs prune hangs on old commits #1860

Closed

ttaylorr reviewed Jan 13, 2017

View reviewed changes

monitorjbl force-pushed the master branch from 7cc1f0a to ed8f77a Compare January 13, 2017 21:51

monitorjbl force-pushed the master branch from fbbe360 to ddad782 Compare January 17, 2017 04:40

technoweenie approved these changes Jan 17, 2017

View reviewed changes

technoweenie mentioned this pull request Jan 17, 2017

Ensure subprocess commands with stdin pipes are closed properly #1874

Closed

4 tasks

monitorjbl force-pushed the master branch from 20a5f9c to 730b5c3 Compare January 18, 2017 04:11

monitorjbl added 2 commits January 17, 2017 23:40

Fix for git-lfs#1860: Closing stdout pipe before function return

a9945ba

Adding integration test for git-lfs#1860

dd19c1b

monitorjbl force-pushed the master branch from 730b5c3 to 401f2d0 Compare January 18, 2017 05:06

technoweenie suggested changes Jan 18, 2017

View reviewed changes

monitorjbl force-pushed the master branch 2 times, most recently from ca5abc8 to 318efd4 Compare January 19, 2017 14:37

Fix for git-lfs#1874

be73bc0

Created subprocess.Cmd wrapper to autoclose any created pipes Removing original fix for git-lfs#1860

monitorjbl force-pushed the master branch from 318efd4 to be73bc0 Compare January 19, 2017 14:37

technoweenie approved these changes Jan 19, 2017

View reviewed changes

Merge branch 'master' into master

a311df4

ttaylorr approved these changes Jan 23, 2017

View reviewed changes

technoweenie merged commit 07f0584 into git-lfs:master Jan 23, 2017

technoweenie pushed a commit that referenced this pull request Feb 15, 2017

Backport master from #1861 to release-1.5 [merge conflicts]

a2c3588

technoweenie mentioned this pull request Feb 15, 2017

Backport #1861 for v1.5.x: Fix for #1860: Closing stdout pipe before function return #1944

Closed

Fix for #1860: Closing stdout pipe before function return #1861

Fix for #1860: Closing stdout pipe before function return #1861

Uh oh!

Conversation

monitorjbl commented Jan 13, 2017

Uh oh!

ttaylorr left a comment

Choose a reason for hiding this comment

Uh oh!

technoweenie commented Jan 13, 2017

Uh oh!

monitorjbl commented Jan 13, 2017

Uh oh!

technoweenie commented Jan 13, 2017

Uh oh!

monitorjbl commented Jan 13, 2017

Uh oh!

monitorjbl commented Jan 16, 2017

Uh oh!

monitorjbl commented Jan 16, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

technoweenie left a comment

Choose a reason for hiding this comment

Uh oh!

technoweenie left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

monitorjbl commented Jan 18, 2017

Uh oh!

technoweenie commented Jan 18, 2017

Uh oh!

technoweenie commented Jan 19, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

monitorjbl commented Jan 16, 2017 •

edited

Loading