I'm super new to coding. I had a random idea in the car to make this application in order dive head first into learning to code.
This is an app to take an MP3 file, "shred it" into pieces (creates 1 second mp3 files temporarily), then randomize the pieces, and then compiles a new MP3 file with all the "shreds" in a randomized order. In order to accomplish this, I used my very limited knowledge and so my code probably isn't written well.
Is there anything that you might change about my code? Any critiques will be helpful. I learned and put this together in about 10 hours. (I'm really enjoying this!)
A future goal I have is to make this functionality web-accessible so that my friends can use it without downloading executable code.
from pydub import AudioSegmentfrom mutagen.mp3 import MP3import randomimport osfrom glob import globfrom string import ascii_lowercasefrom random import choice, randint, randomimport sysimport timefrom PyQt5.QtWidgets import * from PyQt5.QtGui import * from PyQt5.QtCore import *from PyQt5 import QtCorefrom PyQt5 import QtWidgetsclass shredder(QWidget): def __init__(self, parent = None): super(shredder, self).__init__(parent) self.setMinimumSize(350, 100) #self.setFixedWidth(350) #self.setFixedHeight(100) layout = QVBoxLayout() self.btn = QPushButton("Load MP3") self.btn.clicked.connect(self.getfile) layout.addWidget(self.btn) self.le = QLabel("File Location:") self.le.setAlignment(QtCore.Qt.AlignCenter) layout.addWidget(self.le) self.btn1 = QPushButton("Shred") self.btn1.clicked.connect(self.shred) layout.addWidget(self.btn1) self.setLayout(layout) self.setWindowTitle("MP3Shredder") def getfile(self): global filepath filepath, _filter = QFileDialog.getOpenFileName(self, 'Open MP3 File', '',"MP3 (*.mp3)") if filepath: self.le.setText("File Location: "+filepath) def shred(self): global filepath try: filepath except NameError: self.le.setText("You Must First Load A MP3 File To Shred!") else: #importing file with pydub from location by giving its path song = AudioSegment.from_mp3(filepath) #creates an MP3 object with mutagen from file location and saves it as variable audio = MP3(filepath) audio_info = audio.info length = int(audio_info.length) hours, mins, seconds = audio_duration(length) print('Total Duration Of Song: {}:{}:{}'.format(hours, mins, seconds)) print("In Seconds: " + str(audio.info.length)) print("Creating " + str((audio.info.length//1)+1) +" shreds.") os.makedirs(os.path.join(r"C:\mp3shredder"), exist_ok=True) os.makedirs(os.path.join(r"C:\mp3shredder\temp"), exist_ok=True) print("Shredding...") #split sound in 1-second slices and export for i, chunk in enumerate(song[::1000]): with open(r"C:\mp3shredder\temp\piece%s.mp3" % i, "wb") as f: chunk.export(f, format="mp3") path = (r"C:\mp3shredder\temp") files = os.listdir(path) for index, file in enumerate(files): os.rename(os.path.join(path, file), os.path.join(path, ''.join([choice(ascii_lowercase) for _ in range(randint(5, 8))])+".mp3")) print("Shredding Complete!") print("Gluing...") shredded_pieces = [AudioSegment.from_mp3(mp3_file) for mp3_file in glob(r"C:\mp3shredder\temp\*.mp3")] glue_it_back = AudioSegment.empty() for song in shredded_pieces: glue_it_back += song filedir = os.path.dirname(filepath) filename = os.path.basename(filepath) glue_it_back.export(filedir + "/" + "SHREDDED_" + filename, format="mp3") dir = (r"C:\mp3shredder\temp") for f in os.listdir(dir): os.remove(os.path.join(dir, f)) os.rmdir(dir) print("Gluing Complete!") print("Created: " + filedir + "/" + "SHREDDED_" + filename) self.le.setText("Created: " + filedir + "/" + "SHREDDED_" + filename) os.startfile(filedir)# function to convert the information into # some readable formatdef audio_duration(length): hours = length // 3600 # calculate in hours length %= 3600 mins = length // 60 # calculate in minutes length %= 60 seconds = length # calculate in seconds return hours, mins, seconds # returns the duration def main(): app = QApplication(sys.argv) ex = shredder() ex.show() sys.exit(app.exec_())if __name__ == '__main__': main()- 8\$\begingroup\$Just... Why? In my imagination the output would sound gruesome.\$\endgroup\$Reinderien– Reinderien2022-08-16 03:07:17 +00:00CommentedAug 16, 2022 at 3:07
- 6\$\begingroup\$@Reinderien What, you've never wanted to BOGO sort a song?\$\endgroup\$Bee H.– Bee H.2022-08-16 14:54:07 +00:00CommentedAug 16, 2022 at 14:54
- 2\$\begingroup\$@Reinderien obviously you can apply a metric to the 1-second samples, sort in order of this metric, and then cut the last 30% out to make a better song\$\endgroup\$llama– llama2022-08-16 20:50:57 +00:00CommentedAug 16, 2022 at 20:50
- \$\begingroup\$I think you'll be interested to learn your idea is like granular synthesis. This is a fantastic presentation on the topic.youtu.be/FEK8Ggg2xxQ\$\endgroup\$DebrisHauler– DebrisHauler2022-08-17 03:03:15 +00:00CommentedAug 17, 2022 at 3:03
- \$\begingroup\$I like the idea, there's a lot of fun to be had in this space. You might like checking out Adam Edmond on youtubeyoutube.com/watch?v=AL1q-zZWViM\$\endgroup\$AJFaraday– AJFaraday2022-08-18 09:12:44 +00:00CommentedAug 18, 2022 at 9:12
4 Answers4
- Consider what happens if a user closes the "Load MP3" dialog without picking a file. In that case,
filepathwould get set to"". The label would not change, but should the user press the "Shred" button anyway, we would try to import a file called""(since thefilepathvariable exists, we don't get thatNameError) which would presumably not go terribly well - I'm not sure why we write the chunks to temporary files only to read them again? It seems like it'd be simpler to do something like this:
glued_song = AudioSegment.empty()for chunk in random.sample(song[::1000], song.duration_seconds): glued_song += chunk- Should you need to keep the temp files, I would advise using the
tempfilemodule to create it in a more secure and portable manner, in a standard location - You only seem to use
mutagento get the duration of the original track. Sincepydubsupports getting the duration ofAudioSegments, (usingsong.duration_seconds), taking on a whole other dependency just for that seems unnecessary - Instead of a global variable, it may be neater to have the currently selected file path be a property of the
shredderclass - you could set it likeself.filepath = Nonein the__init__, and then access it asself.filepathingetfileandshred - I do think allowing the user to pick the output file name might be a desirable feature
- Finally, I'm not a big fan of some of the variable names in use here:
shredder, as a class name, would typically be written likeShredderbtnandbtn1tells me very little about what those buttons are.load_buttonandshred_button, for example, would be more descriptive- Similarly, I haveno clue what
lemeans. I might go for something likepath_labelorselection_labelinstead getfileis two words -get_file. Or perhapsselect_file, orpick_file, to more clearly indicate a choice is being madesongis re-used for different purposes in the same method, which might get confusingglue_it_backsounds like less like athing and more like something youdo - it's a fine name for a function, but for an object I'd prefer something likeglued_song, or even justresult
- \$\begingroup\$You are so helpful and gracious! This has helped me tremendously!\$\endgroup\$Andrew Worley– Andrew Worley2022-08-19 05:58:00 +00:00CommentedAug 19, 2022 at 5:58
There is quite some room for improvement, both in the overall architecture and style and in the logic that you use:
Architecture and style
Separate logic from presentation
You have a single class handling both the logic ("shredding" the audio file) and the presentation (the PyQt widgets). This makes testing harder, as you can't call yourshred from a test runner without building a GUI, and reviewing more painful than it should: I am not willing to install a 250+ MiB package (PyQt5, in this case) just for a review.
In your case,shred could be a standalone method, taking anAudioFragment as argument and returning anotherAudioFragment.
Also, you print a bunch of things to the console while processing audio files, not only mixing logic and presentation, but mixing multiple kind of presentation -- GUI and console. I get that it can be useful while developing and debugging, but you should consider using a logger instead.
Don't use globals
You use a global variable forfilepath. Globals are usually considered to be bad practice, and while they can have legitimate uses, this is not one. It may work for now, but it's a foot gun: you could change the file path from anywhere else in the code, including outside the GUI class and have the GUI and the internal state out of sync, creating issues for the users. This is especially true given the generic variable name.
In your case, you could keep track of the path using an instance parameter (self.filepath) or pass it around method as an argument.
Naming
Use meaningful variable names. You mostly do, but the UI widgets names carry no information, seebtn andbtn1 for example.load_file_button andshred_button would be much more descriptive names.
Imagine working on a slightly more complex UI, you would lose tracks of what widget does whatvery quickly with your current naming.
Documentation
Your code contains no documentation. Docstrings would be very useful in understanding what the program does, especially for theshred method, which does something pretty unique to your program.
Logic
Don't write to disk if you can avoid it
There is no reason to write the song chunks to disk in your case. You could instead slice the song into chunks and put them into an array, shuffle them withrandom.shuffle and splice them in a newAudioFragment all in memory.
Side effects
What if the user has some personal data in a directory unfortunately namedC:\mp3shedder\temp? It gets nuked when running your program.
If you do have to write to and delete from disk (which you don't, in your case), you should probably be much more careful with how you do it.
Relative over absolute paths
You use hardcoded, absolute paths, which is more likely to introduce side effects when reading, writing and deleting from disk.
- 6\$\begingroup\$In your
Side Effects, you could point OP to thetempfile standard library module, which is specifically designed to create temporary files and directories. With a hard coded temp path also comes OS dependence which you could highlight in your answer\$\endgroup\$FlyingTeller– FlyingTeller2022-08-16 13:47:33 +00:00CommentedAug 16, 2022 at 13:47
First off, this is quite an interesting project for a beginner, well done on getting it working. Other people have mentioned a lot of the issues with our code style, however to expand on a few points:
General PEP-8
PEP-8 is the standard style recommendation for Python. In it, it includes things like how many blank lines, how much indentation, recommendations for variable naming and casing.
There are linters (Flake8,Pylint and others) which check your code against PEP-8 and issue useful warnings. In your case there are several simple and obvious things.
Whitespace
PEP-8 recommends multiples of 4 spaces for indentation, 2 blank lines between top-level functions/classes, 1 blank line between internal functions.
Operators are surrounded by whitespace.
x = a + bKeyword parameters are not surrounded by whitespace.
def func(a, b=None):Commas are followed by spaces.
('A', 'B', 'C')Imports
You importrandom twice, once as a plain library
import randomand once specific elements fromrandom
from random import choice, randint, randomWhich actually causes a name collision betweenrandom the library andrandom.random the function. Obviously not great.
It is also good practice to put Python standard library imports before general ones.
Importing* is generally considered bad practice as it might pollute your namespace making you call functions which you didn't expect to.
E.g. imagine a case where you do
from numpy import *from random import *Bothnumpy andrandom offer a function calledrandom which means you may not know which you are calling when you callrandom()
Instead import the library into the namespace or import only what you intend to use (linters will tell you about unused imports). In your case you are importing:
- time
- random.random
- PyQt5.QtWidgets
None of which are explicitly used in your code.
Case
PEP-8 suggests usingUPPER_SNAKE for top-level/global variables,PascalCase for classes andlower_snake for variables and functions.
shredder,filepath and others don't conform to this.
Separation of tasks
You have this object calledshredder which is not just a shredder, it is the GUI to perform shredding. We should split this off and allow it to exist separately from the GUI which may contain aShredder.
You should also have functions which are named to represent what they do.audio_duration, for example, doesn't actually care aboutaudio, it is just converting a time in seconds to hours, minutes, seconds. You might do one of two things with this, either rename the function tosplit_time or some similar name to describe what it actually does (with docstrings explaining what it does), or make it do what you say it does.
def audio_duration(audio: MP3): "Gets the duration of an audio track in hours, minutes and seconds" length = audio.info.length hours = length // 3600 # calculate in hours length %= 3600 mins = length // 60 # calculate in minutes length %= 60 seconds = length # calculate in seconds return hours, mins, seconds # returns the duration(N.B. There exists adivmod function to do this operation)
Comments, docstrings and type hints
Comments and docstrings are useful tools in python for helping other people understand what is going on, however, good comments are not easy to write. Some of your comments only describe what the code is already telling me:
hours = length // 3600 # calculate in hoursIs fairly self explanatory, honestly, but a comment saying that it "calculates in hours" is given by the fact that we're assigning it to a variable called hours. What the comment doesn't tell me is what we are calculating from. A better comment (and people will disagree, this is all a matter of taste) might be to explain what calculation you are actually performing.
hours = length // 3600 # Seconds to hoursLikewise docstrings are the main tool for other people to use your software. Have you ever typed
>>> import random>>> help(random.randint)Thehelp returns the docstring and tells you what the function is supposed to do, what arguments it takes, what it returns and in some cases how the result is reached. Good docstrings are also useful in code to remind yourself of what a function is doing. Again, a good docstring is hard to write, but some tools like sphinx have standard formats which can be helpful in guiding you as to how to write a good one, along with being able to automatically generate the API documentation for your code (without having to write it by hand). A simple docstring might look like
def audio_duration(length): """Converts duration in seconds to hours, minutes, seconds :param length: duration in seconds :returns: duration in hours, minutes, seconds :rtype: 3-tuple of ints """Whiletype hints are optional in Python, they are just another tool in the toolkit to help explain what your code is doing and are often useful to have for yourself and others to see what your code is doing.
String formats
You are using a mixture of old styles of string formats (% formats and.formats) the modern style is to use f-strings to make clear what's going on in your code.
print('Total Duration Of Song: {}:{}:{}'.format(hours, mins, seconds))print("In Seconds: " + str(audio.info.length))print("Creating " + str((audio.info.length//1)+1) + " shreds.")with open(r"C:\mp3shredder\temp\piece%s.mp3" % i, "wb") as f:become
print(f'Total Duration Of Song: {hours}:{mins}:{seconds}')print(f"In Seconds: {audio.info.length}")print(f"Creating {audio.info.length//1)+1} shreds.")with open(rf"C:\mp3shredder\temp\piece{i}.mp3", "wb") as f:Globals
Avoid using globals, they can cause many problems particularly in larger codes. What happens if you've used a global with a common name likefilepath and someone else has in their library? Who knows what might happen? It's just problems all around.
Instead just keep things in as local a scope as is feasible. Pass arguments into functions and use them locally. Changes to that argument then won't affect the global state (this is not always the case, be aware of what you change of dummy arguments) and it can be passed in from any name. Say I had 100 files I wanted toshred:
global filenamefor my_file in files_list: filename = my_file Shredder.shred()or
for my_file in filelist: Shredder.shred(my_file)Which makes the intent more obvious?
I think that's enough in terms of syntax and structure for now.
You can split mp3 files every 1000 bytes and stitch them together in random order. Here's what will happen: mp3 comes in mp3 blocks of maybe 100-200 bytes. Each block has a four byte header which describes the format of the block and lets you calculate the length. Chances that random 4 bytes match a header are about 1 in 2000. So a player will skip bytes until it finds what looks like a header. Then it calculates the size of the next block according to the header, checks that there is another header at that position, and then you can be reasonably sure you found two real headers, and the MP3 player will start playing. So you have say 10ms of silence at the start.
Since you stitch together 1000 byte blocks without respecting mp3 blocks, you will quite quickly run into a valid block header, followed by some valid bits, followed by invalid bits. Since the player assumes all the bits in an mp3 block are valid, the output will be rubbish. After that, the player recognises that you have no valid header, will skip to the next valid header, and repeat.
The result is that you have randomly reordered music (which is what you wanted), with silence and random rubbish mixed in. It would be much much better to scan through the file, identify the mp3 blocks, and mix them together anyway you like, to avoid the rubbish noises and silences.
Except... There are mp3 variable length modes. Basically, if your blocks should be say 160 bytes each, and one block is nicely compressible and uses only 130 bytes, then the next block uses the last 30 bytes of the previous block and therefore can use 190 instead of 160 bytes. So many blocks use bytes from the previous block. If this is used then you need to extract all the blocks, and determine which bytes each block really contains (add bytes it uses from the previous block, subtract bytes used from the next block). And then you can rearrange the blocks, and try to recreate them so that each block uses the data bytes it should, as far as you can. This won't be perfect, but better than not doing it.
A simpler way is to take the file, decode it, rearrange it in blocks of say 50 milliseconds, and re-encode the output to mp3. You will have lossy encoding, but nobody will notice in your scrambled music.
Do NOT encode small chunks, because each time you do this a random amount of silence is added. An mp3 file isalways a whole range of blocks of about 20-25ms. Rearrange the audio signal in memory instead, then encode the complete file. And you cannot just concatenate mp3 files because that will duplicate all the meta information.
- \$\begingroup\$
AudioSegment.from_mp3does, in fact, decode an mp3 file, and slicing the resulting object does, in fact, slice it based on time and not bytes\$\endgroup\$Sara J– Sara J2022-08-19 10:09:17 +00:00CommentedAug 19, 2022 at 10:09
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.
