Automating the maintenance of the ChangeLog file

Zack Weinberg-2
I would like to separate the discussion of *how we could automate
maintenance of the ChangeLog file* from the discussion of *whether we
should keep writing ChangeLog entries, by hand or with some level of
machine assistance*.  The value of writing a traditional ChangeLog
entry for each commit is contentious even within this project, and any
change would require arguing with GNU upper management who may be even
more attached to them than some of us are.  Maintaining the ChangeLog
file by hand, however, is 100% make-work that makes landing patches
more difficult than it should be, and we could automate it in short
order with one policy change that we can make ourselves and should be
far less contentious, plus some code.

Specifically, I propose:  Effective immediately, project policy
requires the Git commit message for each commit to end with the full
text of a traditional ChangeLog entry for that commit.  Most everyone
is already doing this most of the time, so the policy change just
upgrades it to a requirement.  As soon as someone has time to code up
the automation, we stop maintaining the ChangeLog file as a checked-in
file in the repo; instead, the release scripts generate a ChangeLog
file from the Git commit history, going back to a tag corresponding to
the last manual update of the old file (which we move to ChangeLog.old
-- it's already over a megabyte, so it's time anyway) and inject it
into the tarball.  I recall someone saying a few months ago that the
necessary script already exists, and if not it should be easy to
write.

(I regret I will not have time to code up the automation myself in the
near future.)

zw
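
A minimal sketch of the generation step proposed in the message above, in Python. It assumes each commit message ends with its ChangeLog entry and that the entry begins at the first line of the form `* file ...`; the names `Commit`, `extract_entry`, and `render_changelog` are hypothetical, and the commit data is assumed to have been collected already (for example from `git log` over the range since the last manual update of the file).

```python
import re
from typing import Iterable, NamedTuple, Optional

# Assumption: a ChangeLog entry starts at the first line that looks like
# "* path/to/file ..." (optionally indented) and runs to the end of the message.
ENTRY_START = re.compile(r"^\s*\* \S")


class Commit(NamedTuple):
    date: str      # e.g. "2018-11-21"
    author: str
    email: str
    message: str   # full commit message, subject line included


def extract_entry(message: str) -> Optional[str]:
    """Return the trailing ChangeLog entry of a commit message, or None."""
    lines = message.rstrip().splitlines()
    for i, line in enumerate(lines):
        if ENTRY_START.match(line):
            # Normalise indentation to the traditional single tab.
            body = [("\t" + ln.strip()) if ln.strip() else "" for ln in lines[i:]]
            return "\n".join(body)
    return None


def render_changelog(commits: Iterable[Commit]) -> str:
    """Render commits (newest first) as traditional ChangeLog text."""
    chunks = []
    for c in commits:
        entry = extract_entry(c.message)
        if entry is None:
            continue  # no entry found; a real tool would probably report this
        chunks.append(f"{c.date}  {c.author}  <{c.email}>\n\n{entry}\n")
    return "\n".join(chunks)
```

A free-form commit message that happens to contain a `* item` bullet list would be misread as an entry by this heuristic, which is one reason an explicit convention for marking the entry might be preferable.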

Re: Automating the maintenance of the ChangeLog file

Szabolcs Nagy-2
On 21/11/18 14:13, Zack Weinberg wrote:
> I would like to separate the discussion of *how we could automate
> maintenance of the ChangeLog file* from the discussion of *whether we
> should keep writing ChangeLog entries, by hand or with some level of
> machine assistance*.  The value of writing a traditional ChangeLog
> entry for each commit is contentious even within this project, and any
> change would require arguing with GNU upper management who may be even

why do you say it's contentious?

i don't see anybody arguing for manually writing changelog.

(there seems to be agreement that the only point of the ChangeLog is
to please RMS, which we can do by generating it when creating the
release tarball.  your comment that it's useful to a submitter for
catching bugs is not relevant to the contribution process: submitters
are always free to do whatever exercise they deem useful for catching
more bugs, such as running build-many-glibcs.py for all targets, but
we don't require that.)

> more attached to them than some of us are.  Maintaining the ChangeLog
> file by hand, however, is 100% make-work that makes landing patches
> more difficult than it should be, and we could automate it in short

writing the changelog entry is what takes time,
copying that boilerplate around does not matter much,
so i don't see a significant gain with that approach.

> order with one policy change that we can make ourselves and should be
> far less contentious, plus some code.

Re: Automating the maintenance of the ChangeLog file

Gabriel F. T. Gomes-2
In reply to this post by Zack Weinberg-2
On Wed, 21 Nov 2018, Zack Weinberg wrote:

>Specifically, I propose:  Effective immediately, project policy
>requires the Git commit message for each commit to end with the full
>text of a traditional ChangeLog entry for that commit.  Most everyone
>is already doing this most of the time, so the policy change just
>upgrades it to a requirement.

I only add a ChangeLog entry to the commit message because every commit
needs one, and our project policy states that submitted patches should not
include changes to the ChangeLog file in the diff itself, but before the
diff (see the second paragraph of section 12 in the contribution
checklist [1]).  So, maybe this is already a requirement?

On the other hand, what I would like to see removed, instead, is the
requirement to write a ChangeLog entry, in the first place.  I find it a
lot more useful to read detailed explanations of why something has
changed, rather than what bits have changed in each file.  Several people
write very detailed commit messages, which even explain how something
works, before explaining why it needs to change and how it was changed
(recent example [2]).

Some people do give detailed explanations of how and why something has
changed in ChangeLog format, and such messages in ChangeLog format are
very useful, indeed.  However, they are the exception rather than the norm.

Net, I don't think the ChangeLog format is responsible for people
writing better messages.  People write long, detailed messages because
they can and want to, be it in ChangeLog format or free-form sentences and
paragraphs.

Removing the requirement to have a ChangeLog-formatted section in the
commit message (which I'm in favor of) would totally spoil your proposal
to generate a ChangeLog file from git commit messages, though.  :(

[1] https://sourceware.org/glibc/wiki/Contribution%20checklist#Properly_Formatted_GNU_ChangeLog

[2] https://sourceware.org/ml/libc-alpha/2018-11/msg00508.html

Re: Automating the maintenance of the ChangeLog file

Zack Weinberg-2
In reply to this post by Szabolcs Nagy-2
On Wed, Nov 21, 2018 at 10:43 AM Szabolcs Nagy <[hidden email]> wrote:

>
> On 21/11/18 14:13, Zack Weinberg wrote:
> > I would like to separate the discussion of *how we could automate
> > maintenance of the ChangeLog file* from the discussion of *whether we
> > should keep writing ChangeLog entries, by hand or with some level of
> > machine assistance*.  The value of writing a traditional ChangeLog
> > entry for each commit is contentious even within this project, and any
> > change would require arguing with GNU upper management who may be even
>
> why do you say it's contentious?
> i don't see anybody arguing for manually writing changelog.

Andreas Schwab has, in the other thread.

> (there seems to be agreement that the only point of the ChangeLog is
> to please RMS, which we can do by generating it when creating the
> release tarball)

That's not what it looks like to me.

> (your comment that it's useful for catching bugs by
> a submitter are not relevant to the contribution process)

True.

> writing the changelog entry is what takes time,
> copying that boilerplate around does not matter much,
> so i don't see a significant gain with that approach.

Perhaps because I find writing the changelog entry to be a useful
bug-catching exercise, I don't mind the time it takes.  But I do very
much mind having to remember to rebase an approved patch series, edit
the actual changelog file in each commit, copy and paste the text out
of the commit message.  Without that, the process of pushing an
approved patch series would be a string of commands that I can rattle
off without thinking about them.  With it, it takes multiple manual
actions for each patch in the series and it's easy to make mistakes.
That's what I care about eliminating, and I'm frustrated that a
GNU-level policy change that may never be accomplished is blocking it.

zw

Re: Automating the maintenance of the ChangeLog file

Zack Weinberg-2
In reply to this post by Gabriel F. T. Gomes-2
On Wed, Nov 21, 2018 at 10:44 AM Gabriel F. T. Gomes
<[hidden email]> wrote:

> On Wed, 21 Nov 2018, Zack Weinberg wrote:
>
> >Specifically, I propose:  Effective immediately, project policy
> >requires the Git commit message for each commit to end with the full
> >text of a traditional ChangeLog entry for that commit.  Most everyone
> >is already doing this most of the time, so the policy change just
> >upgrades it to a requirement.
>
> I only add a ChangeLog entry to the commit message because every commit
> needs one, and our project policy states that submitted patches should not
> include changes to the ChangeLog file in the diff itself, but before the
> diff (see the second paragraph of section 12 in the contribution
> checklist [1]).  So, maybe this is already a requirement?

Not quite: I think we would need to update the _committer_ checklist
(https://sourceware.org/glibc/wiki/Committer%20checklist) to state
that the changelog text must be included in the commit message.

Glancing at "git log" output, it appears to me that some committers
(e.g. Joseph, H.J.) always do this and some (e.g. Florian, Andreas)
don't.

> On the other hand, what I would like to see removed, instead, is the
> requirement to write a ChangeLog entry, in the first place.

This is the discussion that I specifically _don't_ want to have in
this thread, because it involves a GNU-level policy change that I fear
may never actually happen, and I do not want that to block process
improvements that don't technically depend on it.  See my reply to
Szabolcs for why I think not having to update the ChangeLog _file_ on
each commit is a valuable process improvement in itself.

zw

Re: Automating the maintenance of the ChangeLog file

Rafael Avila de Espindola
In reply to this post by Zack Weinberg-2
"Zack Weinberg" <[hidden email]> writes:

> I would like to separate the discussion of *how we could automate
> maintenance of the ChangeLog file* from the discussion of *whether we
> should keep writing ChangeLog entries, by hand or with some level of
> machine assistance*.

For what it is worth, I have worked on both GNU and LLVM projects. IMHO
the big day-to-day difference is not even the new and shiny design, it
is the lesser extent of the self-inflicted pain. Not having to write
boilerplate in an arcane format would go a long way in reducing the pain
in GNU projects.

Cheers,
Rafael


Re: Automating the maintenance of the ChangeLog file

Jeff Law
In reply to this post by Zack Weinberg-2
On 11/21/18 8:57 AM, Zack Weinberg wrote:

> On Wed, Nov 21, 2018 at 10:43 AM Szabolcs Nagy <[hidden email]> wrote:
>>
>> On 21/11/18 14:13, Zack Weinberg wrote:
>>> I would like to separate the discussion of *how we could automate
>>> maintenance of the ChangeLog file* from the discussion of *whether we
>>> should keep writing ChangeLog entries, by hand or with some level of
>>> machine assistance*.  The value of writing a traditional ChangeLog
>>> entry for each commit is contentious even within this project, and any
>>> change would require arguing with GNU upper management who may be even
>>
>> why do you say it's contentious?
>> i don't see anybody arguing for manually writing changelog.
>
> Andreas Schwab has, in the other thread.
>
>> (there seems to be agreement that the only point of the ChangeLog is
>> to please RMS, which we can do by generating it when creating the
>> release tarball)
>
> That's not what it looks like to me.
>
>> (your comment that it's useful for catching bugs by
>> a submitter are not relevant to the contribution process)
>
> True.
>
>> writing the changelog entry is what takes time,
>> copying that boilerplate around does not matter much,
>> so i don't see a significant gain with that approach.
>
> Perhaps because I find writing the changelog entry to be a useful
> bug-catching exercise, I don't mind the time it takes.  But I do very
> much mind having to remember to rebase an approved patch series, edit
> the actual changelog file in each commit, copy and paste the text out
> of the commit message.  Without that, the process of pushing an
> approved patch series would be a string of commands that I can rattle
> off without thinking about them.  With it, it takes multiple manual
> actions for each patch in the series and it's easy to make mistakes.
> That's what I care about eliminating, and I'm frustrated that a
> GNU-level policy change that may never be accomplished is blocking it.

Agreed 100%.  This is precisely where I want to see GCC go in the short
term as well.

Like you, I find writing the ChangeLog useful in that I'm forced to look
at my change again and I regularly see something that I think ought to
be fixed and run through another test cycle.

But the insanity of actually putting it into a ChangeLog file, dealing
with rebases/conflicts as a result is just silly, particularly when we
could just include the data in the commit message and have tools to
extract them from commit messages for those who find reading ChangeLogs
useful (such as myself).

I haven't really got any traction on that for GCC though.

jeff

>
> zw
>


Re: Automating the maintenance of the ChangeLog file

Siddhesh Poyarekar-8
In reply to this post by Zack Weinberg-2
On 21/11/18 7:43 PM, Zack Weinberg wrote:
> Specifically, I propose:  Effective immediately, project policy
> requires the Git commit message for each commit to end with the full
> text of a traditional ChangeLog entry for that commit.  Most everyone

FWIW, this was my original proposal with a few more tags (Reviewed-By,
Signed-off-by, etc.) because it potentially has far less friction than
dropping the ChangeLog completely.

In fact it remains a fallback plan for me if the review for the current
proposed script takes too long.  I picked up 2.29 release management
with the express intention of getting rid of the ChangeLog file, so I
want to make sure that it happens in the next couple of months :)

Siddhesh

Re: Automating the maintenance of the ChangeLog file

Siddhesh Poyarekar-8
In reply to this post by Jeff Law
On 21/11/18 9:56 PM, Jeff Law wrote:
> But the insanity of actually putting it into a ChangeLog file, dealing
> with rebases/conflicts as a result is just silly, particularly when we
> could just include the data in the commit message and have tools to
> extract them from commit messages for those who find reading ChangeLogs
> useful (such as myself).

In addition to just the inconvenience of rebases in our own trees, the
requirement of maintaining the file prevents us from using any sort of
patch review tools because none of our commits correspond to any patches
we send.  The patchwork deployment for glibc was effectively useless
because of this; having to manually close out patches instead of them
auto-closing on triggers is a major inconvenience.

Siddhesh

Re: Automating the maintenance of the ChangeLog file

Joseph Myers
In reply to this post by Zack Weinberg-2
On Wed, 21 Nov 2018, Zack Weinberg wrote:

> any change would require arguing with GNU upper management who may be
> even more attached to them than some of us are.

We *have* agreement from GNU upper management to allow not using the
existing ChangeLog format (and, thus, have no entity-level descriptions of
the individual pieces of the changes unless the patch author thinks such a
description is useful in a particular case), *if* the list of changed
entities can be generated from the commits reliably enough.  (I failed to
persuade RMS that the list of named entities for a commit is not
significantly useful because in practice the problem you tend to have in
development is the inverse one - given an entity you're looking at for
some reason, identify past changes to it - which git tools already have
various ways to address.)

https://lists.gnu.org/archive/html/bug-standards/2018-05/msg00003.html
https://lists.gnu.org/archive/html/bug-standards/2018-05/msg00011.html

--
Joseph S. Myers
[hidden email]

Re: Automating the maintenance of the ChangeLog file

Zack Weinberg-2
On Wed, Nov 21, 2018 at 12:11 PM Joseph Myers <[hidden email]> wrote:
> On Wed, 21 Nov 2018, Zack Weinberg wrote:
>
> > any change would require arguing with GNU upper management who may be
> > even more attached to them than some of us are.
>
> We *have* agreement from GNU upper management to allow not using the
> existing ChangeLog format ...

That's good to know but I still think it's worth pushing forward
independently on process improvements that don't depend on a format
change.

zw

Re: Automating the maintenance of the ChangeLog file

Siddhesh Poyarekar-8
On 21/11/18 10:55 PM, Zack Weinberg wrote:
> That's good to know but I still think it's worth pushing forward
> independently on process improvements that don't depend on a format
> change.

It's worthwhile to give this a couple of weeks IMO, because enforcing
process changes can be hard to implement consistently and I'd rather not
write ChangeLogs.  If there's no sign of agreement on the proposed
script by mid-December, I'll make a proposal for workflow changes and
scripting to make it happen.  They can be exceptions to the freeze too
since they don't affect functionality.  That way we have more than a
month and a half to hash it out.

But I am optimistic about the automatic ChangeLog generation...

Siddhesh

Re: Automating the maintenance of the ChangeLog file

Jeff Law
In reply to this post by Siddhesh Poyarekar-8
On 11/21/18 10:00 AM, Siddhesh Poyarekar wrote:

> On 21/11/18 9:56 PM, Jeff Law wrote:
>> But the insanity of actually putting it into a ChangeLog file, dealing
>> with rebases/conflicts as a result is just silly, particularly when we
>> could just include the data in the commit message and have tools to
>> extract them from commit messages for those who find reading ChangeLogs
>> useful (such as myself).
>
> In addition to just the inconvenience of rebases in our own trees, the
> requirement of maintaining the file prevents us from using any sort of
> patch review tools because none of our commits correspond to any patches
> we send.  The patchwork deployment for glibc was effectively useless
> because of this; having to manually close out patches instead of them
> auto-closing on triggers is a major inconvenience.

Yup.  I'd like to see GCC moving to a model where there's a pull request
which is ack'd and the right things "just happen".  One less round trip
for a goodly number of patches would be helpful to everyone I suspect.

Jeff

Re: Automating the maintenance of the ChangeLog file

Florian Weimer-5
In reply to this post by Zack Weinberg-2
* Zack Weinberg:

> Specifically, I propose:  Effective immediately, project policy
> requires the Git commit message for each commit to end with the full
> text of a traditional ChangeLog entry for that commit.

How can I see what is part of the proposed commit (message) and what is
not during patch review?

The patch part is clear, but the commit message and the ChangeLog entry
that is supposedly contained in that is unclear.

Thanks,
Florian

Re: Automating the maintenance of the ChangeLog file

Zack Weinberg-2
On Tue, Nov 27, 2018 at 2:29 AM Florian Weimer <[hidden email]> wrote:
> * Zack Weinberg:
> > Specifically, I propose:  Effective immediately, project policy
> > requires the Git commit message for each commit to end with the full
> > text of a traditional ChangeLog entry for that commit.
>
> How can I see what is part of the proposed commit (message) and what is
> not during patch review?

If we take `git format-patch` as a baseline, that means the subject
line is the first line of the commit message, and everything in the
email body up to the first line of the actual diff is the commit
message. But sometimes one wants to say things in the email that
aren't going to go into the commit message.  I don't have a good
suggestion for how to handle that, but if we came up with a convention
for separating the commit message from additional notes to reviewers,
I don't think it would be a problem getting everyone to follow it.

zw

Re: Automating the maintenance of the ChangeLog file

J William Piggott


On Tue, 27 Nov 2018, Zack Weinberg wrote:

> On Tue, Nov 27, 2018 at 2:29 AM Florian Weimer <[hidden email]> wrote:
>> * Zack Weinberg:
>>> Specifically, I propose:  Effective immediately, project policy
>>> requires the Git commit message for each commit to end with the full
>>> text of a traditional ChangeLog entry for that commit.
>>
>> How can I see what is part of the proposed commit (message) and what is
>> not during patch review?
>
> If we take `git format-patch` as a baseline, that means the subject
> line is the first line of the commit message, and everything in the
> email body up to the first line of the actual diff is the commit
> message. But sometimes one wants to say things in the email that
> aren't going to go into the commit message.  I don't have a good
> suggestion for how to handle that, ...

Git already has a way to do that. The submitter can write anything they want
between the '---' line and the 'diff ...' line.

That is, '---' ends the commit message.

>
> zw
>
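
The '---' convention described above can be sketched as follows. This is a simplified model in Python, not git's actual parser: it assumes the first line that is exactly `---` ends the commit message and that the patch starts at the first line beginning with `diff `. The function name `split_patch_email` is hypothetical.

```python
from typing import Tuple


def split_patch_email(body: str) -> Tuple[str, str, str]:
    """Split an email body into (commit_message, notes, diff).

    Simplified model: the first '---' line before the patch ends the commit
    message; text between it and the first 'diff ' line is reviewer notes
    (in format-patch output this is also where the diffstat sits).
    """
    lines = body.splitlines()
    # Index of the first line of the patch, or end of body if there is none.
    diff_at = next(
        (i for i, ln in enumerate(lines) if ln.startswith("diff ")), len(lines)
    )
    # Index of the separator, searched only in the part before the patch.
    sep_at = next(
        (i for i, ln in enumerate(lines[:diff_at]) if ln.rstrip() == "---"), None
    )
    if sep_at is None:
        message, notes = lines[:diff_at], []
    else:
        message, notes = lines[:sep_at], lines[sep_at + 1:diff_at]
    return (
        "\n".join(message).strip(),
        "\n".join(notes).strip(),
        "\n".join(lines[diff_at:]),
    )
```

With a split like this, a reviewer's tooling could show the first element as "what will be committed" and the second as "remarks for reviewers only".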

Re: Automating the maintenance of the ChangeLog file

Tulio Magno Quites Machado Filho-3
J William Piggott <[hidden email]> writes:

> On Tue, 27 Nov 2018, Zack Weinberg wrote:
>
>> On Tue, Nov 27, 2018 at 2:29 AM Florian Weimer <[hidden email]> wrote:
>>> * Zack Weinberg:
>>>> Specifically, I propose:  Effective immediately, project policy
>>>> requires the Git commit message for each commit to end with the full
>>>> text of a traditional ChangeLog entry for that commit.
>>>
>>> How can I see what is part of the proposed commit (message) and what is
>>> not during patch review?
>>
>> If we take `git format-patch` as a baseline, that means the subject
>> line is the first line of the commit message, and everything in the
>> email body up to the first line of the actual diff is the commit
>> message. But sometimes one wants to say things in the email that
>> aren't going to go into the commit message.  I don't have a good
>> suggestion for how to handle that, ...
>
> Git already has a way to do that. The submitter can write anything they want to
> between the '---' and 'diff ...'
>
> That is '---' ends the commit message.

Git also supports scissors (from git-mailinfo(1)):

       --scissors
           Remove everything in body before a scissors line. A line
           that mainly consists of scissors (either ">8" or "8<") and
           perforation (dash "-") marks is called a scissors line, and
           is used to request the reader to cut the message at that
           line. If such a line appears in the body of the message
           before the patch, everything before it (including the
           scissors line itself) is ignored when this option is used.

           This is useful if you want to begin your message in a
           discussion thread with comments and suggestions on the
           message you are responding to, and to conclude it with a
           patch submission, separating the discussion and the
           beginning of the proposed commit log message with a
           scissors line.

           This can be enabled by default with the configuration
           option mailinfo.scissors.

git am also supports this.

--
Tulio Magno
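
To make the scissors behaviour quoted above concrete, here is a rough Python sketch. It is deliberately stricter than git: it only accepts a line made up entirely of dashes, whitespace, and at least one `>8` or `8<` mark, whereas git's own heuristic tolerates some extra text on the line. Cutting at the first such line is an assumption of this sketch, and the names `is_scissors_line` and `cut_at_scissors` are hypothetical.

```python
import re

# Simplified pattern: dashes/whitespace around at least one ">8" or "8<".
SCISSORS = re.compile(r"^[-\s]*(?:>8|8<)(?:[-\s]|>8|8<)*$")


def is_scissors_line(line: str) -> bool:
    """True for lines such as '-- >8 --' or '-----8<-----'."""
    # Require at least one dash so a bare "8<" is not treated as a cut mark.
    return "-" in line and bool(SCISSORS.match(line))


def cut_at_scissors(body: str) -> str:
    """Drop everything up to and including the first scissors line."""
    lines = body.splitlines()
    for i, ln in enumerate(lines):
        if is_scissors_line(ln):
            return "\n".join(lines[i + 1:])
    return body  # no scissors line: leave the body untouched
```

This sketch does not stop scanning at the start of the diff, so it should only be applied to the part of the email that precedes the patch.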

Re: Automating the maintenance of the ChangeLog file

Florian Weimer-5
* Tulio Magno Quites Machado Filho:

> J William Piggott <[hidden email]> writes:
>
>> On Tue, 27 Nov 2018, Zack Weinberg wrote:
>>
>>> On Tue, Nov 27, 2018 at 2:29 AM Florian Weimer <[hidden email]> wrote:
>>>> * Zack Weinberg:
>>>>> Specifically, I propose:  Effective immediately, project policy
>>>>> requires the Git commit message for each commit to end with the full
>>>>> text of a traditional ChangeLog entry for that commit.
>>>>
>>>> How can I see what is part of the proposed commit (message) and what is
>>>> not during patch review?
>>>
>>> If we take `git format-patch` as a baseline, that means the subject
>>> line is the first line of the commit message, and everything in the
>>> email body up to the first line of the actual diff is the commit
>>> message. But sometimes one wants to say things in the email that
>>> aren't going to go into the commit message.  I don't have a good
>>> suggestion for how to handle that, ...
>>
>> Git already has a way to do that. The submitter can write anything they want to
>> between the '---' and 'diff ...'
>>
>> That is '---' ends the commit message.
>
> Git also supports scissors (from git-mailinfo(1)):
>
>        --scissors
>            Remove everything in body before a scissors line. A line
>            that mainly consists of scissors (either ">8" or "8<") and
>            perforation (dash "-") marks is called a scissors line, and
>            is used to request the reader to cut the message at that
>            line. If such a line appears in the body of the message
>            before the patch, everything before it (including the
>            scissors line itself) is ignored when this option is used.
>
>            This is useful if you want to begin your message in a
>            discussion thread with comments and suggestions on the
>            message you are responding to, and to conclude it with a
>            patch submission, separating the discussion and the
>            beginning of the proposed commit log message with a scissors line.
>
>            This can be enabled by default with the configuration
>            option mailinfo.scissors.
>
> git am also supports this.

Does our Patchwork instance support this?  Are there Gnus/mutt
extensions to highlight the relevant parts of a patch submission?

I want to make sure that both patch authors and reviewers can tell with
high confidence what is proposed for committing.  If we want to generate
ChangeLogs from commit message contents, I think we need to increase the
quality of commit messages.

Does GNU offer hosting for a development workflow which is closer to the
pull request model?

Thanks,
Florian

Re: Automating the maintenance of the ChangeLog file

Tulio Magno Quites Machado Filho-3
Florian Weimer <[hidden email]> writes:

> * Tulio Magno Quites Machado Filho:
>
>> Git also supports scissors (from git-mailinfo(1)):
>>
>>        --scissors
>>            Remove everything in body before a scissors line. A line
>>            that mainly consists of scissors (either ">8" or "8<") and
>>            perforation (dash "-") marks is called a scissors line, and
>>            is used to request the reader to cut the message at that
>>            line. If such a line appears in the body of the message
>>            before the patch, everything before it (including the
>>            scissors line itself) is ignored when this option is used.
>>
>>            This is useful if you want to begin your message in a
>>            discussion thread with comments and suggestions on the
>>            message you are responding to, and to conclude it with a
>>            patch submission, separating the discussion and the
>>            beginning of the proposed commit log message with a scissors line.
>>
>>            This can be enabled by default with the configuration
>>            option mailinfo.scissors.
>>
>> git am also supports this.
>
> Does our Patchwork instance support this?

Here is an example: https://patchwork.sourceware.org/patch/29853/

It doesn't highlight the relevant parts, but it does remove the comments when
generating a patch.

> Are there Gnus/mutt extensions to highlight the relevant parts of a patch
> submission?

I don't think so.  Neither does notmuch.
The scissor area is treated as the commit message.

> If we want to generate ChangeLogs from commit message contents, I think we
> need to increase the quality of commit messages.

Agreed.

--
Tulio Magno

Re: Automating the maintenance of the ChangeLog file

Joseph Myers
In reply to this post by Florian Weimer-5
On Wed, 28 Nov 2018, Florian Weimer wrote:

> Does our Patchwork instance support this?  Are there Gnus/mutt

I thought that the main requirement for patchwork to remove committed
patches automatically was that the git-patch-id of the committed change
was the same as that of the patch posting.  The patch-id does not depend
on the commit message, only on the diffs (but this does require that the
posted diffs be the same as the committed ones - in particular, committing
changes to the ChangeLog file that aren't included in the diffs prevents
such automatic removal at present).

--
Joseph S. Myers
[hidden email]
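
The point about patch IDs can be illustrated with a toy fingerprint in Python. This is not the algorithm used by `git patch-id` and will not produce the same values; it only mirrors the property described above, namely that the ID is computed from the diffs and ignores the commit message. Skipping hunk headers and `index` lines and stripping whitespace are assumptions of this sketch, and `simple_patch_id` is a hypothetical name.

```python
import hashlib


def simple_patch_id(patch_text: str) -> str:
    """Hash only the diff part of a patch; the commit message is ignored."""
    h = hashlib.sha1()
    in_diff = False
    for line in patch_text.splitlines():
        if line.startswith("diff "):
            in_diff = True
        if not in_diff:
            continue  # commit message and reviewer notes do not count
        if line.startswith(("@@", "index ")):
            continue  # line numbers and blob hashes are ignored
        # Whitespace-insensitive: hash the line with all whitespace removed.
        h.update(("".join(line.split()) + "\n").encode("utf-8"))
    return h.hexdigest()
```

Under this model, a posted patch and the commit made from it share an ID only if their diffs match, so a commit that additionally touches the ChangeLog file gets a different ID from the patch that was posted.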