I recently defined Continuous Integration as the practice of constantly merging development work into a Master/Trunk/Mainline branch so that you can test your changes, and test that your changes work with everyone else's changes - the "as early as possible" integration methodology. The idea is to test your code as often as possible to catch issues early (Continuous Delivery vs Continuous Deployment vs Continuous Integration).
Watching a presentation by Jez Humble of ThoughtWorks, who defines Continuous Integration (CI) in relation to Continuous Delivery, I realized that my definition is in direct opposition to two minutes of his presentation: http://www.youtube.com/watch?v=IBghnXBz3_w&feature=youtu.be&t=10m
But why is Jez so adamant about these points, when I feel I am doing CI without everyone committing to Mainline daily, with feature branches, and often working locally, making commits without running ALL the integration tests? Can I be? I say yes, and the reason is where the integration points are and what kind of code moves forward to Mainline - unstable code or stable code. These differences keep a clean, stable Mainline, which in turn gives you the ability to deploy code to Production at any time. Jez's definition of CI (Centralized CI) causes bottlenecks and is in direct contradiction with the process of Continuous Deployment (CD), whereas Distributed CI removes these barriers while still giving confidence that good code is moving to Production.
Argument
The argument goes that a Continuous Integration process constantly integrates all development work across the entire project into Mainline to detect issues early in the code's lifecycle. I argue that when utilizing Continuous Deployment this is detrimental, and that the proper way is to integrate only Production-ready code into Mainline, while merging Mainline back onto the isolated development work. Recently, I explained this concept to another developer, and their response was: "I have never thought about merging backwards from master to branches in order to run tests - amazing." Continuous Deployment is not achievable using traditional CI methodologies as described by ThoughtWorks and others, because the bottlenecks will prevent the flow of code from developer to Production. But the simple notion of merging backwards to run tests - seeing a view of Mainline before you integrate upwards - allows you to achieve Continuous Deployment.
Broken Builds
Oh, and let's not forget broken builds - a broken build prevents anyone from reliably taking code from Mainline or moving code from Mainline to Production. Centralized CI has a way to "fix" this, rule #2: never break the build, and if you do, fix it and do not leave until you have. Never mind the insanity of the first part of that rule - the only way to guarantee it is to never commit, since I cannot control how my code integrates with other, unknown code. And sure, if I break something, I understand that I must fix it. But how do we know it was my code that broke the build and not yours? We don't. But assume it was me: now everyone is waiting for me to fix it. I have become a bottleneck, no code can move past Mainline, and I guess Friday night beers are not in my future. The pipeline from developer to Production has halted dead in its tracks - no code can be Continuously Deployed. Some places call this a lesson (http://www.hanselman.com/blog/FirstRuleOfSoftwareDevelopment.aspx) and assume you will learn from the ordeal. You will fear it, yes, but to work in fear . . . I prefer not to. I prefer not a system that requires me to stay late to keep Mainline stable, but rather one that always ensures Mainline is stable, before and after my code is merged.
Distributed CI deals with this via a "mergeback" from Mainline into your developer branch: make sure you have a clean build by running the unit tests on the branch, then merge up to Mainline. Otherwise - having failed a unit test on the developer branch - do not merge to Mainline and do not become the bottleneck; go out for beers, rest well, and fix it tomorrow. After the developer branch is merged to Mainline, do not run the unit tests again - the test suite was already run against a copy of Mainline in your development branch. Already knowing that all tests have passed, the code is deployed right out to Production. Then Mainline is merged backwards into the other developers' branches, and the unit tests are run individually on each developer branch. If any tests fail, the owner of that branch must fix the issue.
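As a concrete sketch of the mergeback flow - everything here is hypothetical: the throwaway repository, the `mainline` and `dev-alice` branch names, and `run_tests`, which stands in for your real test suite:

```shell
# Minimal mergeback sketch in a throwaway repository.
set -e
repo="${TMPDIR:-/tmp}/dci_mergeback"
rm -rf "$repo"; mkdir -p "$repo"; cd "$repo"
git init -q
git config user.email ci@example.com
git config user.name "CI Demo"

echo stable > app.txt
git add . && git commit -qm "stable release"
git branch -m mainline                  # our Mainline

git checkout -qb dev-alice              # isolated developer branch
echo feature > feature.txt
git add . && git commit -qm "new feature work"

git checkout -q mainline                # meanwhile, a stable commit ships
echo hotfix >> app.txt
git commit -qam "hotfix deployed to Production"

# Mergeback: pull Mainline INTO the developer branch and test there,
# so we see a view of Mainline before integrating upwards.
git checkout -q dev-alice
git merge -q mainline -m "mergeback from mainline"
run_tests() { grep -q hotfix app.txt && test -f feature.txt; }  # stand-in suite
if run_tests; then
  git checkout -q mainline
  git merge -q dev-alice -m "integrate up"  # only tested code reaches Mainline
else
  echo "tests failed on dev-alice - Mainline stays clean, fix it tomorrow"
fi
```

If the branch's tests fail, nothing merges upwards, so Mainline never blocks anyone else - the failure stays on `dev-alice`.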
Feature Branches
Rule #3: no feature branches. Wait, what? Hold on, I like my feature branches. But Jez did just say: "But you can't say you're doing Continuous Integration and be doing feature branching. It's just not possible by definition (while waving hands)" - didn't he? Look at the definition of CI from Martin Fowler of ThoughtWorks:
Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible.
I do not actually see anything there that says CI cannot have feature branches - just that all developers' work must integrate frequently, not even necessarily daily. In Jez's Centralized CI, feature branches are a no-go: you must integrate ALL changes daily into a centralized Mainline. But hold on - this code can't be integrated into Mainline; it's weeks away from delivery. In Distributed CI, a feature branch is nothing more than a developer's branch. Feature branch away. Mainline will be integrated back into the feature branch after any Production deploy, and all new stable code will be integrated and tested immediately.
Hmmm, sounds like feature branching and CI are not mutually exclusive - nice. But Jez was very clear on this. Yes, his worry is about long-running code never being integrated with. OK - so you want a true feature branch and do not want to integrate up to Mainline for longer than a day. Well, you can, since you are always integrating backwards from Mainline - but you must do this continually. Once we are ready to move the feature branch into Mainline, we already have a snapshot of what Mainline will look like: the feature branch is Mainline plus all the new code of the feature branch, and it has been Continuously Integrated - every time Mainline is updated - since its creation. If two developers want to integrate with each other, they can in the Distributed CI world: they can be moved behind another integration point, where their work is merged up to it and then to Mainline. This is basically utilizing the ability to create a distributed network of developers working off of various localized Mainlines that merge up to a Master Mainline.
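One way to picture the "localized Mainlines" idea is a team integration branch sitting between two developers and the Master Mainline. This is a hypothetical sketch - the repository, the `team-mainline` branch, and the developer names are all invented for illustration:

```shell
# Two developers integrate with each other on a team-level "localized
# Mainline" first; only the stable result moves up to the Master Mainline.
set -e
repo="${TMPDIR:-/tmp}/dci_team"
rm -rf "$repo"; mkdir -p "$repo"; cd "$repo"
git init -q
git config user.email ci@example.com
git config user.name "CI Demo"

echo base > app.txt
git add . && git commit -qm "base release"
git branch -m mainline                  # Master Mainline
git branch team-mainline                # the team's localized Mainline

git checkout -qb dev-carol team-mainline
echo carol > carol.txt
git add . && git commit -qm "carol's work"

git checkout -qb dev-dan team-mainline
echo dan > dan.txt
git add . && git commit -qm "dan's work"

# Carol and Dan integrate with each other behind the team's point...
git checkout -q team-mainline
git merge -q dev-carol -m "carol -> team"
git merge -q dev-dan -m "dan -> team"

# ...and only then does their combined, tested work merge up.
git checkout -q mainline
git merge -q team-mainline -m "team -> master mainline"
```

The same pattern nests: each `team-mainline` is just another developer branch from the Master Mainline's point of view.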
Is Distributed CI really CI?
Yes. If you are implementing Distributed CI and CD together, you will always be integrating your developer code with the latest stable release in Production. If an integration test fails after a release to Production, it fails on a developer's branch, not for everyone, so we know the integration failure is isolated to that developer's branch and its new code; the developer who owns the failed branch must attend to it. In Centralized CI, the developers must discuss and work out the issue to see whose code is at fault. CD demands that you deliver code to Production often. In Distributed CI, any time you deliver code to Production, the Mainline branch is merged back to the developer branches and tested (thank goodness for automation). What does this mean? Let's look at the life cycle of a commit a little closer:
The top diagram shows a Centralized CI process, where the developer merges into a Mainline that is integrated with other developers' commits before running the unit tests. Mainline cannot necessarily be deployed, since some of the work from other developers may not be Production ready. The lower diagram shows a Distributed CI process, where the developer merges Mainline backwards, then integrates up to Mainline, then pushes on through to Production.
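The post-deploy mergeback loop can be sketched in a throwaway repository. Again a hypothetical sketch: the `dev-*` branch naming convention, the repository location, and the `run_tests` stand-in are all assumptions, not part of any real setup:

```shell
# After each Production release, merge Mainline back into every developer
# branch and run that branch's tests in isolation.
set -e
repo="${TMPDIR:-/tmp}/dci_fanout"
rm -rf "$repo"; mkdir -p "$repo"; cd "$repo"
git init -q
git config user.email ci@example.com
git config user.name "CI Demo"

echo v1 > app.txt
git add . && git commit -qm "release v1"
git branch -m mainline
git branch dev-alice
git branch dev-bob

echo v2 > app.txt                       # a new stable release ships
git commit -qam "release v2 deployed to Production"

run_tests() { grep -q v2 app.txt; }     # stand-in for the branch's suite
for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/dev-*); do
  git checkout -q "$branch"
  if git merge -q mainline -m "mergeback after deploy" && run_tests; then
    echo "$branch: green"
  else
    echo "$branch: owner of this branch must fix the integration"
  fi
done
```

Both branches come back green here; in the failing case the loop names exactly one owner, so no one else's flow is blocked.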
If you are performing CD, you are releasing very often, and each stable commit is still tested against every developer's development work. Except in Distributed CI it is tested individually, so detection of issues is easier - there is less conflicting code - and the failure is isolated to a single developer's branch. Distributed CI is a more thorough form of CI when implemented with CD, which attempts to break processes down into smaller automated pieces. Centralized CI is suited to iteration release planning, but Distributed CI works just as well for that, and better. If you are doing Centralized CI, then you are not doing Continuous Deployment. Distributed CI removes all the bottlenecks and constraints that Centralized CI places on the workflow from developer to Production.
So why all this confusion around CI and the artificial constraints put on it? Clearly, the centralized VCS is the basis for these constraints. Basically, Distributed CI treats each developer branch as if it were Mainline and tests there, instead of up at a centralized Mainline. Perhaps it is because CI grew up in a time of non-distributed VCS and has not modernized since. Now is the time to take advantage of a distributed VCS. Unleash those feature branches, commit code now, and merge to Mainline to see it in Production moments later - yes, moments later, every time.