Remove unused modules across project

Hey folks, my goal is to remove all modules which are not referenced starting from the entry module. There are thousands of modules in this project, and manual labor is not a viable option.

Does elm-review or anything else help with this?

If your goal is to remove only unused modules (files), and doing so in an automated way, I’m not sure elm-review can help (but would love to be proved wrong)!

Someone else might have a much smarter and easier way, but I happen to have a script that finds all modules that are referenced from an entrypoint. Kind of.

So you could:

  1. Find all candidate .elm files for deletion.
  2. Run my script to get a list of the file paths not to remove from given entrypoint(s).
  3. Remove all candidate .elm files not mentioned in that list.
  4. Compile your entrypoint(s) and run your tests to check that everything still works.

So about that script.

In elm-watch I need to compute all file paths that some given entrypoints depend on, to know which .elm files should trigger recompilation when they change.

To benchmark that, I created a little CLI script that takes entrypoints as input and finds all the files they depend on recursively. It only prints the duration and the number of files found though. But it’s easy to patch it to print all the file paths. Note: Some file paths might not exist (because reasons), but I don’t think it should matter for your case.

Here’s how to run it (more or less):

  1. git clone git@github.com:lydell/elm-watch.git
  2. cd elm-watch
  3. npm ci
  4. Apply these changes:
    diff --git a/scripts/BenchmarkImportWalker.ts b/scripts/BenchmarkImportWalker.ts
    index 0427b96..458b6d7 100644
    --- a/scripts/BenchmarkImportWalker.ts
    +++ b/scripts/BenchmarkImportWalker.ts
    @@ -107,7 +107,7 @@ function run(args: Array<string>): void {
       console.timeEnd("Run");
       switch (result.tag) {
         case "Success":
    -      console.log("allRelatedElmFilePaths", result.allRelatedElmFilePaths.size);
    +      console.log(Array.from(result.allRelatedElmFilePaths).join("\n"));
           process.exit(0);
         case "ImportWalkerFileSystemError":
           console.error(result.error.message);
    
  5. Run the script: node -r esbuild-register scripts/BenchmarkImportWalker.ts path/to/Entrypoint.elm path/to/Second/Entrypoint.elm
1 Like

Hi @Birowsky :wave:

elm-review can help you with this with the NoUnused.Modules rule, but it won’t delete the files directly, you will have to do that manually. So it depends with what you mean with manual labor is not a viable option (not doing any work manually or not spending time figuring out the unused modules).

You can try it out here:

elm-review --template jfmengels/elm-review-unused/example --rules NoUnused.Modules

I have started jotting down ideas on how to support automatic file suppressions in autofixes in elm-review, so it will come one day.

Until then, if you really want this to be automated with deletion, you can probably create a script where you take the output of elm-review --report=json and have a script read that and delete the files automatically (documentation for the format here). You might need to run it several times (or in watch mode) until there are no more errors, as deleting unused modules might uncover more unused modules.

@lydell’s approach should work as well, but you should make sure you don’t delete tests/ or delete modules that contain a main that you were not thinking of. That will be handled in the case of NoUnused.Modules.

I highly recommend you then look at the other rules from the same package, as that will likely uncover a lot more unused code or files.

PS: NoUnused.Modules is actually deprecated, you should use NoUnused.Exports instead, though that does slightly more than what you were asking for here.

Have fun deleting code! :blush:

2 Likes

Hey Jeroen!

I did some basic setup, I ran this command:

elm-review --fix-all --ignore-dirs=src/scripts/Api,src/scripts/ElmFramework --ignore-files=src/scripts/AssetsImg.elm,src/scripts/AssetsSvg.elm,src/scripts/IconSvg.elm

it says that elm-format failed with “Unable to parse file”:

I also tried producing json output with:

elm-review --template jfmengels/elm-review-unused/example --fix-all --ignore-dirs 'src/scripts/ElmFramework','src/scripts/AWSWrapper' --report=json > out.json

But it does not produce JSON content inside the file. It’s instead a standard console output. With a prompt waiting for continuation.

Considering that I’m working in a proprietary repo, I won’t be able to share the project, but I’m open for a call, if you feel like. You can find me as birowsky on Slack.

I’ll now proceed with what mr. Lydell suggests.

The idea behind my suggestion for using --report=json is that you run only the NoUnused.Modules (--rules NoUnused.Modules) rule and interpret the output from the tool without the use of --fix-all.

I don’t recommend you run other rules than this one until you want to get to the next step. Using other rules will just slow you down in your current objective. Once it’s done, turn on other rules one by one.

If you want to try applying fixes AND using --report=json, you can probably do that though I have never tried it. In this case, you can use --fix-all-without-prompt which won’t prompt you and should therefore work better.


For the elm-format error, I’m wondering which error caused an elm-format parsing failure. It would be great if you could try to provide an SSCCE if/when you have the time. You have 810 fixes right now so it’s hard to tell you which one caused it, but maybe it’s the NoExposingEverything one that creates a very long line, longer than supported (Elm 0.19.1 only supports lines that are less than 65536 characters long IIRC, and I wouldn’t be surprised if that’s causing). Happy to help you debug (I’m jfmengels on Slack, but there’s also a #elm-review channel there).

BTW - I think I prefer small rules to big ones, so its possible to check for unused modules or exports independantly. Are you generally prefering to merge things into more comprehensive rules, or does the NoUnused.Exports come with a set of flags to select what is actually checks? Not that you have to listen to me, but I thought I would express this preference as a suggestion anyway, see if others think the same or different.

Similar with the NoUnused.Variables rules which also checks for unused imports. I’d like to clean up imports without having to run the full NoUnused.Variables sometimes.

In my opinion, it depends a lot, and I guess I search for what makes the better experience rather than caring about big or small rules.

For instance, the NoUnused.Modules got “merged” into NoUnused.Exports because when they were separate, the result was a bit terrible. Since NoUnused.Exports autofixed errors (while NoUnused.Modules didn’t), if there was an unused module, NoUnused.Exports would remove every export one by one (and then NoUnused.Variables would remove every function one by one) until there was nothing to remove. This was annoying and extremely time-consuming. With the 2 merged and caring about each other’s situation, we now don’t autofix exports for modules that are never imported and we let the user remove the module manually. (Well, until we have an automatic fix at least).

For NoUnused.Variables, we could go either way: keep it as is or split it up. The main reason why I’m thinking of keeping it as one is because computing unused imports and unused variables is very much the same work, so separating them into two rules would mean the amount of work would be doubled (meaning it’s slower).

One reason to separate the two would be for instance if you wanted to disable one of two report types but not the other (in this case, likely the variables part). In practice, I find the NoUnused.Variables rarely suppressed or not enabled (not that I have very much data though).

Happy to discuss this more (and be convinced otherwise) but probably better done in a separate thread :slightly_smiling_face:

4 Likes

Well gentlemen, I managed to pull this off. Here are the notes for my future self, or anyone else:

  • Clear out unused imports across the project using IntelliJ’s Elm plugin:


image

  • Use the following script to clear out unused modules, including the resulting empty folders.
import * as fs from 'fs-extra';
import * as path from 'path';

export {
  removeUnusedElmModules,
}

function removeUnusedElmModules(
  directory = 'src/scripts',
  entryModuleName = 'MainModule',
  excludeDirectories = [
    'src/scripts/Api',
    'src/scripts/ElmFramework',
    'review/src',
  ],
): void {
  const allElmPaths = findFiles(directory, excludeDirectories, '.elm');
  const allUsedElmPaths = getUsedElmPaths(directory, entryModuleName);
  const allUnusedElmPaths = excludeMembers(allElmPaths, allUsedElmPaths);
  removeFiles(allUnusedElmPaths);
  removeEmptyDirectories(directory);
}


function getUsedElmPaths(directoryPath: string, entryModuleName: string): string[] {
  return getUsedModules(directoryPath, entryModuleName)
    .map(moduleName => getModulePath(directoryPath, moduleName));

  function getUsedModules(baseDir: string, entryModuleName: string): string[] {
    const visitedModules = exploreModule(baseDir, new Set(), entryModuleName);
    return Array.from(visitedModules);
  }

  function exploreModule(baseDir: string, visitedModules: Set<string>, moduleName: string): Set<string> {
    if (visitedModules.has(moduleName)) return visitedModules;
    const modulePath = getModulePath(baseDir, moduleName);
    if (!fs.existsSync(modulePath)) return visitedModules;
    visitedModules.add(moduleName);
    const content = readModuleContent(modulePath);
    const imports = extractImports(content);
    imports.forEach((importName) => exploreModule(baseDir, visitedModules, importName));
    return visitedModules;
  }

  function readModuleContent(modulePath: string): string {
    return fs.readFileSync(modulePath, 'utf-8');
  }

  function extractImports(content: string): string[] {
    return (content.match(/^import\s+([^\s]+)/gm) || []).map((line) => line.split(' ')[1]);
  }

  function getModulePath(baseDir: string, moduleName: string): string {
    return path.join(baseDir, moduleName.replace(/\./g, '/') + '.elm');
  }
}

function removeFiles(filePaths: string[]): void {
  filePaths.forEach(filePath => {
    try {
      fs.unlinkSync(filePath);
    } catch (error) {
      console.error(`Failed to delete ${filePath}:`, error);
    }
  });
}

function excludeMembers(toChange: string[], excludeThese: string[]): string[] {
  const setB = new Set(excludeThese);
  return toChange.filter(item => !setB.has(item));
}


function findFiles(directory: string, excludedPaths: string[], extension: string): string[] {
  return excludedPaths.includes(directory)
    ? []
    : fs.readdirSync(directory).flatMap(decider);

  function decider(candidate: string): string[] {
    const fullPath = path.join(directory, candidate);
    const stat = fs.statSync(fullPath);

    return stat.isDirectory()
      ? findFiles(fullPath, excludedPaths, extension)
      : path.extname(candidate) === extension
        ? [fullPath]
        : [];
  }
}


function removeEmptyDirectories(directory: string): void {
  const files = fs.readdirSync(directory);

  files.forEach(file => {
    const fullPath = path.join(directory, file);
    const stat = fs.statSync(fullPath);

    if (stat.isDirectory()) {
      removeEmptyDirectories(fullPath);
    }
  });

  if (fs.readdirSync(directory).length === 0) {
    fs.rmdirSync(directory);
  }
}

Jeroen, I’ll try to reach out in a couple of days to see if you can debug the elm-review issues.

Cheers!

4 Likes

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.