I'm using Go to build projects that are hosted in Git repositories. These Git repositories contain submodules.
It seems that about every minute when polling, the Go server will wipe the submodules and re-clone them, which is causing an excessive load.
Why would it be doing this? It can surely keep syncing the existing submodule clone.
For that matter, why does it even need to check out the submodules? The changes that need to be detected are in the parent repository, as submodule checkouts are essentially commit hashes stored in commits to the parent repository.
The Go server log frequently contains the following line (about every 2 minutes):
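To illustrate the point about submodules being commit hashes in the parent repository: `git ls-tree -r HEAD` lists each submodule as an entry with mode 160000 and type `commit`. Below is a minimal Python sketch that picks those entries out of such a listing. The function name, the sample paths, and the hashes are made up for illustration; this is not how Go itself detects changes.

```python
def submodule_pins(ls_tree_output):
    """Return {path: commit_sha} for submodule (gitlink) entries.

    Expects text in the format printed by `git ls-tree -r HEAD`:
    "<mode> <type> <sha>\t<path>" per line.
    """
    pins = {}
    for line in ls_tree_output.splitlines():
        meta, sep, path = line.partition("\t")
        parts = meta.split()
        if not sep or len(parts) != 3:
            continue  # skip blank or unexpected lines
        mode, obj_type, sha = parts
        # Submodules are stored as mode 160000 entries of type "commit".
        if mode == "160000" and obj_type == "commit":
            pins[path] = sha
    return pins


# Hypothetical sample listing; the paths and hashes are invented.
SAMPLE = (
    "100644 blob 1111111111111111111111111111111111111111\tREADME\n"
    "160000 commit 2222222222222222222222222222222222222222\tlibs/common\n"
)

if __name__ == "__main__":
    print(submodule_pins(SAMPLE))
```

If the pinned hash for a submodule changes between two parent commits, that change is already visible in the parent repository's history without checking out the submodule.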
WARN [materialUpdateThread] MaterialUpdateService:60 [Material Update] Skipping update of material GitMaterial ... which has been in-progress since [about 1 minute before]
I'm sure that a build initiated by a git hook would be better, but I'd prefer to poll if it is as efficient as it should be.
The Go server version is: 2.4.0
The git version is 1.7.4.1
Kind Regards
Andy
Hi Andy,
The Go server performs a material check every minute, and uses a UUID as the destination directory for 'git clone' the first time. From the second time onwards Go does not perform a clone, but instead uses the existing clone and performs a fetch on it.
The flow that wipes the directory and performs a fresh clone only engages when a different git repo (a repo with a different URL) is already cloned in the destination directory.
This can happen in a few scenarios, for instance if you have a UUID conflict (a new repo trying to use a UUID which is already used by another repo), or if the destination is not writable by the user that Go is running as.
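The decision described above can be summarised in a small Python sketch. It only models the three outcomes mentioned here (first clone, fetch on an existing clone, wipe and re-clone when the URL differs). The function name, the action labels, and the example URLs are hypothetical, it is not Go's actual code, and the unwritable-directory case is not modelled.

```python
def plan_material_update(existing_url, wanted_url):
    """Pick the action for one polling cycle of the parent repository.

    existing_url: URL of the repo already cloned in the destination
                  directory, or None if nothing has been cloned yet.
    wanted_url:   URL of the material being polled.
    """
    if existing_url is None:
        return "clone"            # first check: fresh clone into the UUID dir
    if existing_url == wanted_url:
        return "fetch"            # later checks: reuse the clone, fetch only
    return "wipe-and-clone"       # a different repo occupies the directory


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(plan_material_update(None, "git://example.org/a.git"))
    print(plan_material_update("git://example.org/a.git", "git://example.org/a.git"))
    print(plan_material_update("git://example.org/b.git", "git://example.org/a.git"))
```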
Can you please enable debug logging? That is, modify /etc/go/log4j.properties (or the equivalent for your installation environment) to change the line 'log4j.logger.com.thoughtworks.cruise=INFO' to 'log4j.logger.com.thoughtworks.cruise=DEBUG', and follow it with a server restart.
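If editing the file by hand is inconvenient, the change could be scripted along these lines. This is only a sketch: the function name is made up, it works on the file contents as a string, and it assumes the logger line has exactly the 'INFO' value quoted above. Reading and writing the file, and restarting the server, are left to you.

```python
def enable_debug(properties_text,
                 logger="log4j.logger.com.thoughtworks.cruise"):
    """Return the properties text with the given logger switched from INFO to DEBUG.

    Lines that do not match the logger key, or whose value is not exactly
    INFO, are left untouched.
    """
    out = []
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == logger and value.strip() == "INFO":
            line = logger + "=DEBUG"
        out.append(line)
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    if properties_text.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    sample = "log4j.logger.com.thoughtworks.cruise=INFO\n"
    print(enable_debug(sample), end="")
```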
Please let the server run for some time (a few hours) after doing this and send us the server log for this period.
Regards,
Janmejay
Hi Janmejay
I've got logs for an hour of debug activity. How would you like me to pass them on?
Regarding what I meant about cloning: Go doesn't call git clone directly but, as described in http://community.thoughtworks.com/posts/45dcafc35c, goes through each submodule, deletes its data, and then runs
git submodule update --init
which internally runs "git clone -n {submodule url} {submodule dir}" for each submodule (as Go deleted the original).
This could cause hundreds of MB of data to be transferred over the network or filesystem every minute, increasing as the submodules gain history.
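To put a rough number on that, here is a small Python calculation. The 100 MB per poll figure is an assumption picked for illustration, not a measurement, and the function name is made up. The once-a-minute rate is the polling interval discussed in this thread.

```python
def daily_transfer_gb(mb_per_poll, polls_per_hour=60):
    """Estimate data re-cloned per day, in GB (1 GB = 1024 MB)."""
    return mb_per_poll * polls_per_hour * 24 / 1024


if __name__ == "__main__":
    # Assumed 100 MB of submodule data re-cloned on every one-minute poll.
    print(daily_transfer_gb(100))  # 100 * 60 * 24 / 1024 = 140.625
```

Under that assumption the re-cloning alone would move roughly 140 GB a day.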
If the submodules are removed from the repo (and so are no longer cloned), the polling causes barely any load, thanks to incremental fetches of the repository.
Kind Regards
Andy