bug #60297: optimize autodeps

Submitter:  Dmitry Goncharov <dgoncharov>
Submitted:  Sun 28 Mar 2021 01:06:47 AM UTC
   
 
Severity:  3 - Normal
Item Group:  Enhancement
Status:  Fixed
Privacy:  Public
Assigned to:  psmith
Open/Closed:  Closed
Component Version:  None
Operating System:  Any
Fixed Release:  4.4
Triage Status:  Small Effort

Mon 06 Sep 2021 03:06:34 AM UTC, comment #19: 

Thanks, applied this to Git.

Paul D. Smith <psmith>
Group administrator
Tue 06 Apr 2021 12:14:19 AM UTC, comment #18: 

I strongly second Dmitry's proposal that something like .NOTINTERMEDIATE: (preferably with a no-dependencies-means-all interpretation) should be added.  The IMO weird behavior of .SECONDARY where it effectively creates strange weak dependencies just bit me for the second time. This time I just said a short prayer and removed .SECONDARY:, but what I really want is to disable the eccentric handling of intermediate files entirely.

This might add more implementation complexity and one more feature to consider, but it gives a road to simpler behavior.

Britton Kerin <bkerin>
Sun 04 Apr 2021 09:26:50 PM UTC, comment #17: 


> Just a note, in various examples you give prerequisites to the .SECONDEXPANSION target; these are ignored.


Indeed, i was thinking about having this feature and added %.o, but should not have.


> there's already a lot of complexity around intermediate/secondary/etc. files :)


agree



> The first is for %.h.  However, this is not really necessary as far as I can tell.  First, all the headers will be listed (after the $(file <%.d) is expanded) as explicit prerequisites of the target so they won't be intermediate files anyway.


i was thinking about whether the header files are really intermediate in this case. Not what the current implementation does, but are they really intermediate? It is possible to code it either way. i came to the conclusion that header files are intermediate, because make learns about them through stem expansion in $(file <%.d). Since the %.d file that the header files come from is intermediate, the header files are also intermediate.

> Second, headers are almost always source files (not built by make and so not eligible to be removed).


agree

> The only time a source file would be removed is if the user deleted it, which is why the %.h pattern exists: solely as a way to keep make from complaining until the .d file can be rebuilt and the deleted header disappears.


agree

> In the rare situations where a header is an intermediate file (built from something else) you currently need to list it as a specific prerequisite anyway and people seem OK with that.


There are situations where it is difficult for the user to list files (header files or otherwise) explicitly. That's when implicit rules save us.


> The second is for %.d.  Assuming we have some variable or set of variables that lists all the object files to be built, which almost all makefiles must have or can have cheaply, we don't really need this one because we can force all the %.d files to be not intermediate by mentioning them somewhere as a prerequisite to some target.


I agree that explicitly listing all dep files eliminates the need for the %.d pattern. There are situations where this is difficult; I presented some of them in update 6. Obtaining this list usually comes with the runtime cost of reading the filesystem and additional code in the makefile.

> I will agree that there's something nice about being able to just mark all %.d files as not intermediate without having to know all the .o files in a variable like this.  But is it worth the extra complexity?


i do not look at this as "something nice". i look at .NOTINTERMEDIATE as a missing feature in the make interface.

When the user asks "i am using implicit rules, how can i ensure that files which match a specific pattern are not intermediate?" the usual answer is "you can work around it by listing all of them explicitly". If the user can list them explicitly, why use implicit rules?


Implementation-wise there is additional code.
On the other hand, i feel that .NOTINTERMEDIATE simplifies the make interface.
We can now describe the make user interface as presented in update 10.
Let us repeat it here:

1. make provides explicit rules.
2. make provides .INTERMEDIATE to accompany explicit rules to let the user mark a target of choice as intermediate, when it otherwise would be not intermediate.
3. make provides pattern rules with implicit search.
4. make provides .NOTINTERMEDIATE to accompany implicit rules to let the user mark a pattern of choice as not intermediate, when it otherwise would be intermediate.
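
As a minimal sketch of points 2 and 4, the two special targets could be paired with their rule types as follows. The file names gen.c and gen.in are hypothetical, and the pattern form of .NOTINTERMEDIATE is the behavior proposed in this report, so the fragment assumes a make that implements it.

```make
# Explicit rule: gen.c is named in the makefile, so by default it is not
# intermediate. .INTERMEDIATE opts this one target in.
gen.c: gen.in ; cp $< $@
.INTERMEDIATE: gen.c

# Pattern rule: a %.d file reached only through implicit rule search is
# intermediate by default. .NOTINTERMEDIATE (as proposed here) opts the
# whole pattern out.
%.d: ;
.NOTINTERMEDIATE: %.d
```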


i think of .SECONDARY as a parametrized .INTERMEDIATE, rather than a part of this interface.
i suspect that, had .NOTINTERMEDIATE existed since day 1, .SECONDARY might never have been implemented and the code base would have been simpler.

Dmitry Goncharov <dgoncharov>
Sun 04 Apr 2021 07:00:32 PM UTC, comment #16: 

I may be wrong about headers not being intermediate after $(file); I thought I had tested this but I just realized I hadn't saved the last edit to my makefile buffer.  I have to run now; I'll look into this again later.

Paul D. Smith <psmith>
Group administrator
Sun 04 Apr 2021 06:56:16 PM UTC, comment #15: 

Just a note, in various examples you give prerequisites to the .SECONDEXPANSION target; these are ignored.  .SECONDEXPANSION is either on or off, for the entire makefile (starting with where it was defined).  It doesn't currently support being applied only to a subset of targets, although this could be added as a feature.

Thanks for the extra detail.  I want to preface this by just saying I don't really have anything against the concept of a NOTINTERMEDIATE facility, I'm just generally against adding complexity, and there's already a lot of complexity around intermediate/secondary/etc. files :)

There are two reasons to use this new target type in your proposal.  Let us consider them separately.

The first is for %.h.  However, this is not really necessary as far as I can tell.  First, all the headers will be listed (after the $(file <%.d) is expanded) as explicit prerequisites of the target so they won't be intermediate files anyway.  Second, headers are almost always source files (not built by make and so not eligible to be removed).  The only time a source file would be removed is if the user deleted it, which is why the %.h pattern exists: solely as a way to keep make from complaining until the .d file can be rebuilt and the deleted header disappears.  In the rare situations where a header is an intermediate file (built from something else) you currently need to list it as a specific prerequisite anyway and people seem OK with that.

The second is for %.d.  Assuming we have some variable or set of variables that lists all the object files to be built, which almost all makefiles must have or can have cheaply, we don't really need this one because we can force all the %.d files to be not intermediate by mentioning them somewhere as a prerequisite to some target.  For example:

    build-deps: $(OBJECTS:%.o=%.d)

will be sufficient.
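
A slightly fuller sketch of this approach, assuming the object list lives in a variable named OBJECTS as in the line above. The three file names are hypothetical, and the $(info) line is only there to show what the substitution reference expands to.

```make
# Hypothetical object list; a real makefile would already define this.
OBJECTS := api.o util.o engine.o

# build-deps has no recipe. Naming the .d files as explicit prerequisites
# is what keeps make from treating them as intermediate. Place this rule
# after the default goal so it does not become the default itself.
build-deps: $(OBJECTS:%.o=%.d)

# Empty rule so that a missing .d file is not an error.
%.d: ;

# Prints "build-deps prerequisites: api.d util.d engine.d" when the
# makefile is read.
$(info build-deps prerequisites: $(OBJECTS:%.o=%.d))
```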

I will agree that there's something nice about being able to just mark all %.d files as not intermediate without having to know all the .o files in a variable like this.  But is it worth the extra complexity?

Paul D. Smith <psmith>
Group administrator
Sat 03 Apr 2021 03:44:49 PM UTC, comment #14: 

Let me provide a verbose description of .NOTINTERMEDIATE here.


This piece of make code makes it possible to get rid of the include directive for generated dep files.

The motivation for this piece of make code is described in items 1, 2, 3 and 4 of update 6.


.SECONDEXPANSION: %.o

%.o: %.c %.d $$(file <%.d)
    gcc $(CPPFLAGS) $(CFLAGS) -MD -MF $*.td -o $@ -c $<
    read obj src headers <$*.td; echo "$$headers" >$*.d
    touch -c $@

%.d: ;
%.h: ;

The only missing piece is that make considers .d and .h files intermediate.

In order for this piece of code to work we need to tell make that files which match %.d and %.h are not intermediate.

.SECONDARY allows us to prevent make from deleting these files.

But, preventing removal is not enough.

.SECONDARY prevents deletion, but the file is still intermediate and thus still gives make a green light not to rebuild a target when one of the intermediate prerequisites is missing.


When a .d or .h file is missing (not deleted by make, but for some other reason) we need to have the related rule run to generate a new .d file.


So, a mechanism is needed to accompany implicit rules to let the user mark chosen patterns as not intermediate (not secondary, but the full opposite of intermediate).
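
The difference can be seen in a small, self-contained sketch. The suffixes .src, .mid and .final and the file name a.mid are hypothetical, chosen so that no built-in rule applies.

```make
# a.mid is reached only through pattern rules when building a.final.
%.final: %.mid ; cp $< $@
%.mid: %.src ; cp $< $@

# .SECONDARY keeps a.mid on disk after the build, but a.mid is still
# treated as intermediate: if a.mid is later removed by hand while
# a.final is newer than a.src, "make a.final" does not rebuild anything.
.SECONDARY: a.mid
```

If files matching %.mid could instead be marked as not intermediate, a missing a.mid would cause both rules to run again, which is the behavior wanted for .d and .h files.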


> But since the targets you are referring to are already intermediate this isn't an issue.


It is an issue, because as long as .d and .h files are intermediate, make won't rebuild if some .d or .h file is missing.


> As far as I can tell, that's the purpose of the .NOTINTERMEDIATE target you introduced: to prevent files from being removed so that $(file ...) can read them.


This is one of the 2 purposes of .NOTINTERMEDIATE. The other purpose is to force a rebuild when a .d or .h file is missing. That's why there are these 2 rules:
%.d: ;
%.h: ;


i am not describing why a rebuild is needed when .d or .h file is missing. You already described that well in your Auto-Dependency Generation article.

Hope this makes it clear.

Dmitry Goncharov <dgoncharov>
Sat 03 Apr 2021 02:54:02 PM UTC, comment #13: 

It may help to read the docs rather than, or at least in addition to, the code:

https://www.gnu.org/software/make/manual/html_node/Chained-Rules.html

You're right that if a target is not intermediate then .SECONDARY will make it intermediate.  But since the targets you are referring to are already intermediate this isn't an issue.

The main thing .SECONDARY does is stop intermediate files from being deleted.  As far as I can tell, that's the purpose of the .NOTINTERMEDIATE target you introduced: to prevent files from being removed so that $(file ...) can read them.  Maybe I didn't understand the justification for this new feature: your note just says it's needed without describing the problems you need solved.

Paul D. Smith <psmith>
Group administrator
Sat 03 Apr 2021 02:23:32 PM UTC, comment #12: 


> Yes, I'm saying we already have a way to mark things "not intermediate" (.SECONDARY)  The only difference between that and a brand new .NOTINTERMEDIATE you have proposed is that .SECONDARY doesn't handle patterns and .NOTINTERMEDIATE does.


Paul, this must be some sort of blindness on my side. I cannot see how .SECONDARY can be used to mark files as not intermediate.
As far as i can see, .SECONDARY does the opposite: it marks files as intermediate. The only difference from .INTERMEDIATE is that the files are not removed.

Dmitry Goncharov <dgoncharov>
Sat 03 Apr 2021 02:01:36 PM UTC, comment #11: 


> Do you mean, that when .SECONDARY depends on a pattern the behavior is the same as that of .NOTINTERMEDIATE (as proposed here)?

Yes, I'm saying we already have a way to mark things "not intermediate" (.SECONDARY).  The only difference between that and a brand new .NOTINTERMEDIATE you have proposed is that .SECONDARY doesn't handle patterns and .NOTINTERMEDIATE does.

So, rather than introduce a new special target that overlaps partly with the existing .SECONDARY why not just enhance .SECONDARY to do everything .NOTINTERMEDIATE does and NOT introduce a brand new special target?

> That would be quite surprising for users, would not it?

In what way?

> make provides .NOTINTERMEDIATE to accompany implicit rules to let the user mark a pattern of choice as not intermediate, when it otherwise would be intermediate.

I agree that if we could go back in time (to the early 1990's or so) and suggest a different name than .SECONDARY that would be better, but of course we can't, and I'd rather have the name be a bit less obvious than have multiple different ways to get the same capability.

Paul D. Smith <psmith>
Group administrator
Fri 02 Apr 2021 02:03:00 AM UTC, comment #10: 

It may indeed be useful to have .SECONDARY accept patterns.

Do you mean that when .SECONDARY depends on a pattern the behavior is the same as that of .NOTINTERMEDIATE (as proposed here)?
That would be quite surprising for users, would it not?

My view of this is the following:
1. make provides explicit rules.
2. make provides .INTERMEDIATE to accompany explicit rules to let the user mark a target of choice as intermediate, when it otherwise would be not intermediate.
3. make provides pattern rules with implicit search.
4. make provides .NOTINTERMEDIATE to accompany implicit rules to let the user mark a pattern of choice as not intermediate, when it otherwise would be intermediate.

Dmitry Goncharov <dgoncharov>
Wed 31 Mar 2021 07:57:42 PM UTC, comment #9: 

I haven't looked at the patches, but one thing to note is that we already have a target which is supposed to do what your .NOTINTERMEDIATE target does: as I understand it, that's the same thing .SECONDARY is supposed to do.  The problem with .SECONDARY is, and has always been (.SECONDARY predates my involvement with GNU make), that it doesn't accept patterns, only filenames, which makes it pretty useless IMO (if you know the filenames you can just make a do-nothing target that lists them as prerequisites and get basically the same behavior).

Rather than introducing a new target, why not enhance the existing .SECONDARY to support applying it to patterns?

Paul D. Smith <psmith>
Group administrator
Sun 28 Mar 2021 02:46:31 PM UTC, comment #8: 

i read that article several times and indeed found it interesting.
In fact, i was using the technique described in your article, until my use cases forced me to come up with the technique described here.

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 07:05:58 AM UTC, comment #7: 
Paul D. Smith <psmith>
Group administrator
Sun 28 Mar 2021 01:21:48 AM UTC, comment #6: 

The modern technique of tracking dependencies uses the include directive.
While this technique is infinitely superior to the manual maintenance of deps,
there is still room for improvement.


1. include is not a part of the DAG.


src:=$(wildcard *.c)
dfiles:=$(src:.c=.d)
%.o: %.c %.d
    gcc -I. -MMD -o $@ -c $<
%.d: ;
include $(dfiles)

When the user runs

$ make hello.o

make includes all dep files in the current directory, even though only hello.d
is needed. This is not optimal.



2. One makefile for multiple programs.

include is especially problematic when a project has one test driver per
source file. E.g. a lib can have implementation files api.c, util.c, and
engine.c and test programs api.t.c, util.t.c and engine.t.c. Each test program
contains a main function and is supposed to be compiled and linked with the
module that it tests.


%.t.tsk: %.o %.t.o
    $(CC) -o $@ $^

When the user runs

$ make api.t.tsk

Only api.d and api.t.d are required to be included. However, our makefile shown
above will include all dep files.



3. One makefile for multiple projects.

Another scenario is a makefile containing a set of implicit rules and shared
between multiple projects.

include takes filenames as parameters. This hinders reuse of the makefile. We
would rather not hardcode filenames of included files in a makefile, but let
implicit rule search figure it out.



4. Phony targets.

Another issue with unconditional include is phony targets, such as clean, gzip,
etc.

The usual workaround is to figure out the specified target and have a set of
ifeq statements to avoid including dep files. This ifeq code is difficult to
get right, especially when there are a lot of phony targets and the user
can specify multiple targets on the command line. E.g.

$ make api.t.tsk install
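
For illustration, one common shape of that workaround is sketched below. The variable names nodeps and goals and the default goal name all are assumptions; src and dfiles are defined as in item 1.

```make
src    := $(wildcard *.c)
dfiles := $(src:.c=.d)

# Goals that never need the dep files (hypothetical list).
nodeps := clean gzip install

# Goals requested on the command line, or "all" when none were given.
goals := $(or $(MAKECMDGOALS),all)

# Include the dep files unless every requested goal is listed in nodeps.
ifneq ($(filter-out $(nodeps),$(goals)),)
-include $(dfiles)
endif
```

With make api.t.tsk install the filter leaves api.t.tsk, so every dep file is still included, which is the problem described in items 1 and 2.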



Solution.

It is possible to achieve the same automatic tracking of dependencies with
include (or rather its substitute) being a part of the DAG.


.SECONDEXPANSION: %.o

%.o: %.c %.d $$(file <%.d)
    gcc $(CPPFLAGS) $(CFLAGS) -MD -MF $*.td -o $@ -c $<
    read obj src headers <$*.td; echo "$$headers" >$*.d
    touch -c $@

%.d: ;
%.h: ;

1. -MD generates a regular dep file.
Note, gcc option -MP is not used. The contents of dep files are of the form

api.o: api.c api.h <other header files>...\
    <more header files>...


2. Postprocessing.

read obj src headers <$*.td; echo "$$headers" >$*.d

is used to extract header files from the generated .td file and store this
list of header files in a .d file.

The contents of the generated .d file are a space-separated list of header
files, all on one line.


3. $(file) appends the contents of %.d to the list of prerequisites.


4. Second expansion ensures $$(file) is expanded only when this rule is used to build
the current target. This is the magic maker here.
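
To make steps 3 and 4 concrete: assume, purely for illustration, that api.d already holds the single line "api.h util.h". When make considers the pattern rule for api.o, the stem is substituted for % and then $$(file <%.d) is expanded, so the prerequisites end up roughly as in this explicit rule:

```make
# Illustration only; api.d is assumed to contain "api.h util.h".
# The pattern rule, as applied to api.o, behaves roughly like:
api.o: api.c api.d api.h util.h
```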



.NOTINTERMEDIATE.

There is still one missing piece in this makefile. Make considers generated dep
files and all header files to be intermediate.  We need a mechanism to tell
make that all files which match %.d and %.h are not to be treated as
intermediate.


So, we introduce the special target .NOTINTERMEDIATE and our makefile becomes


.NOTINTERMEDIATE: %.d %.h
.SECONDEXPANSION: %.o

%.o: %.c %.d $$(file <%.d)
    gcc $(CPPFLAGS) $(CFLAGS) -MD -MF $*.td -o $@ -c $<
    read obj src headers <$*.td; echo "$$headers" >$*.d
    touch -c $@

%.d: ;
%.h: ;


This makefile solves all the above described issues of unconditional include.


An additional bonus is that $(file) is faster than include. When the file
exists, include parses the file and evals its contents, and when the file is
missing include searches a list of directories. $(file) does none of that.


Another use case for .NOTINTERMEDIATE.

gcc has option -MP to generate an explicit target for each header file.

There are compilers which do not have such an option.

.NOTINTERMEDIATE: %.h

can be used to mark header files as not intermediate with those compilers.


$$*.d vs %.d.

With the fix from https://savannah.gnu.org/bugs/?60188 it is possible to make
dep files and header files explicit in certain cases with


%.o: %.c $$*.d $$(file <$$*.d)

However, this workaround fails when the object file is not located in the
current directory. With %.d the implicit search algorithm prepends the
directory name to the stem.


Another advantage of .NOTINTERMEDIATE over $$* is the ability to be used with
built-in rules, if needed.



Notes.

Postprocessing in this example uses bash code and handles the gcc format of dep
files. To handle the one-rule-per-line format that other compilers use, read
can be run in a loop.
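
A minimal sketch of such a loop, assuming a hypothetical compiler whose dep file has one "target: prerequisite" pair per line (for example "api.o: api.h"). The name collect-headers is made up for this example; it would take the place of the read/echo line in the recipe above.

```make
# Assumed input ($*.td), one pair per line:
#   api.o: api.c
#   api.o: api.h
# Output ($*.d): the prerequisites on a single line, separated by spaces.
# The source file ends up in the list too, which is harmless because it
# is already a prerequisite of the object file.
define collect-headers
while read -r obj hdr || [ -n "$$obj" ]; do printf '%s ' "$$hdr"; done <$*.td >$*.d
endef
```

Inside the %.o recipe it would be used as a single recipe line, $(collect-headers), so that $* is expanded for the target being built.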

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 01:17:34 AM UTC, comment #5: 

i noticed that something messed up tabs in the examples in the original submission.
Please disregard original submission.
Let me resubmit.

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 01:13:04 AM UTC, comment #4: 

Typo.

sv60297_notintermediate.diff contains implementation of special target .NOTINTERMEDIATE.

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 01:12:00 AM UTC, comment #3: 

sv60297_notintermediate_doc.diff is a doc.

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 01:11:45 AM UTC, comment #2: 

sv60297_notintermediate_test.diff is a test.

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 01:11:06 AM UTC, comment #1: 

sv60297_notintermediate.diff is implementation of special target .INTERMEDIATE.

(file #51149, file #51150, file #51151)

Dmitry Goncharov <dgoncharov>
Sun 28 Mar 2021 01:06:47 AM UTC, original submission:  

The modern technique of tracking dependencies uses the include directive. While this technique is infinitely superior to the manual maintenance of deps, there is still room for improvement.

1. include is not a part of the DAG.

src:=$(wildcard *.c)
dfiles:=$(src:.c=.d)
%.o: %.c %.d
    gcc -I. -MMD -o $@ -c $<
%.d: ;
include $(dfiles)

When the user runs

$ make hello.o

make includes all dep files in the current directory, even though only hello.d is needed. This is not optimal.

2. One makefile for multiple programs.

include is especially problematic when a project has one test driver per source file. E.g. a lib can have implementation files api.c, util.c, and engine.c and test programs api.t.c, util.t.c and engine.t.c. Each test program contains a main function and is supposed to be compiled and linked with the module that it tests.

%.t.tsk: %.o %.t.o
    $(CC) -o $@ $^

When the user runs

$ make api.t.tsk

Only api.d and api.t.d are required to be included. However, our makefile shown above will include all dep files.

3. One makefile for multiple projects.

Another scenario is a makefile containing a set of implicit rules and shared between multiple projects.

include takes filenames as parameters. This hinders reuse of the makefile. We would rather not hardcode filenames of included files in a makefile, but let implicit rule search figure it out.

4. Phony targets.

Another issue with unconditional include is phony targets, such as clean, gzip, etc.

The usual workaround is to figure out the specified target and have a set of ifeq statements to avoid including dep files. This ifeq code is difficult to get right, especially when there are a lot of phony targets and the user can specify multiple targets on the command line. E.g.

$ make api.t.tsk install

Solution.

It is possible to achieve the same automatic tracking of dependencies with include (or rather its substitute) being a part of the DAG.

.SECONDEXPANSION: %.o

%.o: %.c %.d $$(file <%.d)
    gcc $(CPPFLAGS) $(CFLAGS) -MD -MF $*.td -o $@ -c $<
    read obj src headers <$*.td; echo "$$headers" >$*.d
    touch -c $@

%.d: ;
%.h: ;

1. -MD generates a regular dep file. Note, gcc option -MP is not used. The contents of dep files are of the form

api.o: api.c api.h <other header files>...\
    <more header files>...

2. Postprocessing.

read obj src headers <$*.td; echo "$$headers" >$*.d

is used to extract header files from the generated .td file and store this list of header files in a .d file.

The contents of the generated .d file are a space-separated list of header files, all on one line.

3. $(file) appends the contents of %.d to the list of prerequisites.

4. Second expansion ensures $$(file) is expanded only when this rule is used to build the current target. This is the magic maker here.

.NOTINTERMEDIATE.

There is still one missing piece in this makefile. Make considers generated dep files and all header files to be intermediate. We need a mechanism to tell make that all files which match %.d and %.h are not to be treated as intermediate.

So, we introduce the special target .NOTINTERMEDIATE and our makefile becomes

.NOTINTERMEDIATE: %.d %.h
.SECONDEXPANSION: %.o

%.o: %.c %.d $$(file <%.d)
    gcc $(CPPFLAGS) $(CFLAGS) -MD -MF $*.td -o $@ -c $<
    read obj src headers <$*.td; echo "$$headers" >$*.d
    touch -c $@

%.d: ;
%.h: ;

This makefile solves all the above described issues of unconditional include.

An additional bonus is that $(file) is faster than include. When the file exists, include parses the file and evals its contents, and when the file is missing include searches a list of directories. $(file) does none of that.

Another use case for .NOTINTERMEDIATE.

gcc has option -MP to generate an explicit target for each header file.

There are compilers which do not have such an option.

.NOTINTERMEDIATE: %.h

can be used to mark header files as not intermediate with those compilers.

$$*.d vs %.d.

With the fix from https://savannah.gnu.org/bugs/?60188 it is possible to make dep files and header files explicit in certain cases with

%.o: %.c $$*.d $$(file <$$*.d)

However, this workaround fails when the object file is not located in the current directory. With %.d the implicit search algorithm prepends the directory name to the stem.

Another advantage of .NOTINTERMEDIATE over $$* is the ability to be used with built-in rules, if needed.

Notes.

Postprocessing in this example uses bash code and handles the gcc format of dep files. To handle the one-rule-per-line format that other compilers use, read can be run in a loop.

Dmitry Goncharov <dgoncharov>

 


 

Depends on the following items: None found

Items that depend on this one: None found

 


     

    Follow 9 latest changes.

Date        Changed by  Updated Field     Previous Value => Replaced by
2021-09-06  psmith      Status            None => Fixed
                        Assigned to       None => psmith
                        Open/Closed       Open => Closed
                        Operating System  None => Any
                        Fixed Release     None => 4.4
                        Triage Status     None => Small Effort
2021-03-28  dgoncharov  Attached File     - => Added sv60297_notintermediate.diff, #51149
                        Attached File     - => Added sv60297_notintermediate_test.diff, #51150
                        Attached File     - => Added sv60297_notintermediate_doc.diff, #51151
