Migrating to GitHub
DevOps Live Meetup
September 28, 2016
Sridhar Peddinti
Syed Abbas
Before we begin…
9/30/2016 Copyright 2
THE FOLLOWING PRESENTATION IS BASED SOLELY ON OUR EXPERIENCE AND IS FOR THE FANS
OF GITHUB REPOSITORIES.
MOST OF THE MATERIAL REFERENCED IN THE FOLLOWING SLIDES IS VALID FOR
GITHUB OR ANY GIT-BASED REPOSITORY.
ANY STATEMENTS THAT MIGHT DEMEAN SVN, ACCUREV OR OTHER REPOSITORIES ARE
PURELY COINCIDENTAL.
WE LOVE ALL SOURCE CODE REPOSITORIES…LONG LIVE SOFTWARE DEVELOPMENT,
MAY THERE NEVER BE A TIME WHEN MACHINES GENERATE CODE.
Issued in public interest www.newtglobal.com
Motivators
• Managers and architects like the distributed and offline features of GitHub
• Enforces discipline on the developers' part
• Technically savvy groups are already using GitHub (Git bridges)
• Developers hate the change at first, but they are quick learners and like to keep up with the rest of the industry
• Integration with other DevOps toolkits is a motivator
• Fits naturally with open source software
• Enterprise-level groups are pushing the change (playing catch-up with the rest of the industry)
• GitHub's size limits force the use of binary repositories, which security teams like
SVN vs GitHub
SVN: Centralized version control system.
GitHub: Distributed, with offline capability.

SVN: Branches are additional folders containing a copy of the code base.
GitHub: Branches have their own history and revision tree, which records explicitly where each branch was forked from and makes branch history much easier to manage. Check out: http://stackoverflow.com/questions/2471606/how-and-or-why-is-merging-in-git-better-than-in-svn

SVN: The complete history is held in the central repository, so users must contact the central repository to get the history of a folder. This makes viewing history and changes slow.
GitHub: History is kept locally, which makes it faster to diff, view history, commit changes, merge branches, switch branches and retrieve any other revision of a file.

SVN: When a branch is merged, it has to be deleted to prevent an erroneous merge back to trunk in the future.
GitHub: The branch history is maintained and is completely traceable.
SVN vs GitHub
SVN: All resources and branches associated with a project are often available in one place (the URL of the repository). If that system fails, it is hard to recover the latest state.
GitHub: Many backups exist because every user holds a copy. If users push and fetch changes frequently, a failure tends to lose only a small amount of work.

SVN: Tags are copies of the branch from which they are created. Each tag is an entire folder-to-folder, file-to-file copy of the repository.
GitHub: Tags are symbolic references that point to a single commit in the repository. A tag is like a snapshot of an entire branch at a particular point in time.

SVN fans, check out: https://svnvsgit.com/
SVN to GitHub Migration
Prerequisites:
• Get read-only user access to the SVN repository
• Make sure the git svn utility is available in the local environment
• Download svn-migration-scripts.jar from https://bitbucket.org/atlassian/svn-migration-scripts/downloads
Get the target environment ready:
• Decide on the GitHub organization name, the team name and the members who will be part of the team
• Set up the organization
• Set up the team and assign team members
• Create a repository to receive the code pushed from the work area
For full documentation: https://www.atlassian.com/git/tutorials/migrating-prepare/
SVN to GitHub Migration
Migration:
• Extract the user information from SVN
java -jar svn-migration-scripts.jar authors <SVN Repo URL> > authors.txt
• Clone the SVN repository
git svn clone --stdlayout --authors-file=authors.txt <SVN Repo URL> <GitHub Repo Name>
• Connect the local repository to the remote repository (run this and the following commands inside the cloned directory)
git remote add origin <GitHub Repo URL>
• Push the code from the local repo to the remote repo
git push -u origin master
• Convert the remote branches to local branches
java -Dfile.encoding=utf-8 -jar svn-migration-scripts.jar clean-git --force
• Push all the changes to the remote repo
git push --all
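The command sequence above can also be scripted. The following is a minimal Python sketch, not something the presenters describe using: it lists the same six commands as argument vectors (with plain ASCII hyphens on the flags) so they can be reviewed in a dry run before anything executes. The function names migration_steps and run_steps, the example URLs, and the assumption that svn-migration-scripts.jar sits in the current directory are all illustrative.

```python
import subprocess
from pathlib import Path


def migration_steps(svn_url, github_url, work_dir, jar="svn-migration-scripts.jar"):
    """Return the slide's commands, in order, as (argv, cwd, stdout_file) tuples."""
    # Absolute path, so the jar is still found once we run inside the clone.
    jar_path = str(Path(jar).resolve())
    return [
        # Extract SVN users; stdout is redirected to authors.txt.
        (["java", "-jar", jar_path, "authors", svn_url], ".", "authors.txt"),
        # Clone the SVN repository into work_dir.
        (["git", "svn", "clone", "--stdlayout",
          "--authors-file=authors.txt", svn_url, work_dir], ".", None),
        # The remaining commands run inside the new clone.
        (["git", "remote", "add", "origin", github_url], work_dir, None),
        (["git", "push", "-u", "origin", "master"], work_dir, None),
        (["java", "-Dfile.encoding=utf-8", "-jar", jar_path,
          "clean-git", "--force"], work_dir, None),
        (["git", "push", "--all"], work_dir, None),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd, stdout_file in steps:
        line = " ".join(argv)
        if stdout_file:
            line += f" > {stdout_file}"
        print(f"[{cwd}] {line}")
        if dry_run:
            continue
        if stdout_file:
            with open(Path(cwd) / stdout_file, "w", encoding="utf-8") as out:
                subprocess.run(argv, cwd=cwd, stdout=out, check=True)
        else:
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_steps(migration_steps("https://svn.example.com/repo", "https://github.example.com/org/repo.git", "repo")) only prints the commands; passing dry_run=False runs them and stops at the first failure because of check=True.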
SVN to GitHub Migration
Noteworthy:
• We ran the scripts on Linux (case-sensitive file system); OS X requires additional steps
• Most SVN instances had locally defined user IDs (not SSO), so the generated authors file was not useful for migrating users. We granted user access separately on the GitHub repos
• We ran against local GitHub Enterprise repos, not the SaaS offering
• Incremental migration is possible, but we did not use it because the client was not comfortable with it
• We scheduled weekly jobs for a full migration during the transition phase
• Full migration takes time (a 300 MB repo took around 8 to 10 hours)
• In some cases we had to clone the SVN repo to the local file system first (access issues)
• Some application groups (small groups) preferred lift and shift over migration
• We integrated Slack with GitHub, which managers really loved
• We extended the Hygieia dashboard to provide manager-level analytics
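On the authors-file point above: when SVN logins are local IDs, the generated authors.txt entries do not correspond to real GitHub identities. The presenters handled access separately on GitHub; one alternative, sketched here and not something they report doing, is to rewrite the file before cloning. The sketch assumes the "login = Name <email>" line format that git svn reads for --authors-file. The helper rewrite_authors and all names and addresses are hypothetical.

```python
import re

# One authors-file entry: svn_login = Full Name <email>
AUTHOR_LINE = re.compile(
    r"^\s*(?P<svn>[^=]+?)\s*=\s*(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*$"
)


def rewrite_authors(lines, identities):
    """Replace name and email for every SVN login present in `identities`.

    `identities` maps an SVN login to a (name, email) tuple.
    Logins without an entry, and lines that do not parse, are kept as they are.
    """
    result = []
    for line in lines:
        match = AUTHOR_LINE.match(line)
        if match is None:
            result.append(line.rstrip("\n"))
            continue
        svn = match.group("svn")
        name, email = identities.get(
            svn, (match.group("name"), match.group("email"))
        )
        result.append(f"{svn} = {name} <{email}>")
    return result
```

For example, with identities = {"jdoe": ("Jane Doe", "jane.doe@example.com")}, the line "jdoe = jdoe <jdoe@mycompany.com>" becomes "jdoe = Jane Doe <jane.doe@example.com>", while other logins pass through untouched.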
Lessons Learned…
• Don’t expect large enterprises to have centrally managed source code repos (we found SVN hosted on a shared desktop under a manager’s desk)
• It is difficult to get developers into a distributed mindset (“why should I clone the entire repo?”)
• Working closely with GitHub Enterprise helped reassure clients
• Most developers prefer IDE plugins to the command line or web client
• Be prepared to answer backup, HA and DR questions
• GitHub comes as a virtual appliance, not an application; be prepared to work with infrastructure groups on deploying VMs into production
AccuRev to GitHub Migration
Prerequisites:
• Install Python 3.4, Git Bash 2.7.4 and AccuRev 6.1.1.
• Make sure the paths to the AccuRev and git executables are correct for your machine, and that the git default configuration has been set.
• Clone the ac2git repo from https://github.com/NavicoOS/ac2git
• Run python ac2git.py --help to see all the options (strongly recommended).
• Run python ac2git.py --example-config to get an example configuration.
• Follow the steps outlined in the “How to use” section.
AccuRev to GitHub Migration
Migration:
• Make an example config file:
  python ac2git.py --example-config
• Modify the generated file ac2git.config.example.xml (there are plenty of notes in the file, and this is the time to run the --help option if you skipped it on the previous slide)
• Rename ac2git.config.example.xml to ac2git.config.xml
• Modify the configuration file and add the following information:
  • Set the AccuRev username and password
  • Name of the depot. Map each depot to a single Git repository and run the script for each depot separately.
  • Running the script for multiple depots against a single folder overwrites all depot streams into the same folder, and the script fails when given multiple folders, so always run the script for one depot at a time
AccuRev to GitHub Migration
• Create an empty folder and provide its complete path in the config XML file.
  • The folder must exist and should preferably be empty.
  • The folder name does not need to match the stream name. The script just needs an empty folder in which to store all the contents of the stream.
• Start and end transactions, which correspond to what you would enter in the accurev hist command as the <time-spec> (a number, the keyword highest or the keyword now).
  • If the start-transaction and the end-transaction are time-specs, the script fetches data and history only within that time period. For example, start-transaction = “2013-02-07 13:41:17” and end-transaction = “2014-02-07 13:41:17”
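Because a mistyped start or end transaction only shows up once a long conversion is under way, it can help to check the values before editing the config file. The following is a small sketch, assuming only what the slide states: a value is a transaction number, the keyword highest or now, or a timestamp written like the example above. classify_transaction is a hypothetical helper, not part of ac2git, and the timestamp pattern simply mirrors the slide's example; the exact time-spec syntax your AccuRev version accepts should be confirmed with the accurev hist documentation.

```python
from datetime import datetime


def classify_transaction(value):
    """Return 'keyword', 'number' or 'timestamp' for a start/end transaction value.

    Raises ValueError for anything else.
    """
    text = value.strip()
    if text in ("highest", "now"):
        return "keyword"
    if text.isascii() and text.isdigit():
        return "number"
    try:
        # Pattern taken from the slide's example, e.g. "2013-02-07 13:41:17".
        datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        raise ValueError(
            f"not a transaction number, keyword or timestamp: {value!r}"
        ) from None
    return "timestamp"
```

A value such as "highest" is classified as a keyword and "2013-02-07 13:41:17" as a timestamp, while a typo such as "higest" raises ValueError.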
AccuRev to GitHub Migration
• Use the “highest” keyword instead of “now” in end-transaction
  • “now” fetches data up to the current date; if some history was deleted from the workspace or not promoted from other streams, the “now” keyword will not work.
  • “highest” always looks for the latest and highest commit history (preferred option)
• The stream must always have some history. If there is no history, the script will not work.
• User mapping from AccuRev to GitHub. Hint: run accurev show -fi users to see a list of all the users
• If there are any duplicate or missing usernames, the script will not work. Change ac2git.py to handle such scenarios: in the GetMissingUsers(config) method of ac2git.py, comment out the last two lines:
# if not found:
#     missingList.append(user)
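Before patching ac2git.py, it may be worth finding out which usernames actually cause the problem. The following is a minimal pre-flight sketch, not part of ac2git: check_user_mapping is a hypothetical helper that takes two plain lists, the AccuRev usernames (for example, collected by hand from the accurev show -fi users output) and the AccuRev usernames that appear in the config file's user mapping. How those lists are extracted is left out because it depends on output and config details not reproduced in these slides.

```python
from collections import Counter


def check_user_mapping(accurev_users, mapped_users):
    """Return (missing, duplicates) as sorted lists of usernames.

    missing:    AccuRev users with no entry in the mapping.
    duplicates: usernames that appear more than once in the mapping.
    """
    counts = Counter(mapped_users)
    duplicates = sorted(user for user, n in counts.items() if n > 1)
    missing = sorted(set(accurev_users) - set(mapped_users))
    return missing, duplicates
```

With accurev_users = ["alice", "bob", "carol"] and mapped_users = ["alice", "bob", "bob"], the function reports carol as missing and bob as duplicated, which points at the entries to fix in the config file.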
AccuRev to GitHub Migration
• Choose the preferred method for converting the streams.
  • We recommend the “deep-hist” method for sparse streams (where the transactions that changed the stream contents are far apart).
  • We recommend the “diff” method for regular streams (when in doubt, just use “deep-hist”).
• Run the script:
  python ac2git.py
• If you encounter any trouble, run the script with the --help flag for more options.
Thank You
Sridhar Peddinti
sridharp@newtglobal.com
Syed Abbas
syeda@newtglobal.com