Scala to Python - rdd folder #1


Merged
jleetutorial merged 10 commits into master from pedromb-scala_to_python on Sep 29, 2017

Conversation

pedromb
Collaborator

Converted all the Scala files in the RDD folder to Python.

from pyspark import SparkContext
from commons.Utils import Utils

def splitComma(line: str):

Have you tried to run this program? It doesn't compile

  File "/Users/cwei/code/python-spark-tutorial-new/rdd/airports/AirportsByLatitudeSolution.py", line 4
    def splitComma(line: str):
                       ^
SyntaxError: invalid syntax

pedromb
Collaborator, Author

Hi, yes, I did run all the programs. Which version of Python are you running? This should work in the latest Python 3 version.

Sorry for the confusion. It works with Python 3; I was running Python 2.7. Feel free to ignore this comment.
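For context, the `line: str` parameter annotation is Python 3 syntax, which is why the Python 2.7 parser stops at the colon. If one file ever had to run under both interpreters, a type comment is one possible workaround. The sketch below is only an illustration: the name `split_comma` and the function body are hypothetical, since the diff excerpt shows just the signature.

```python
def split_comma(line):
    # type: (str) -> list
    """Split one comma-separated record into its fields.

    The type comment above carries the same hint as `line: str`,
    but it is an ordinary comment, so Python 2.7 parses it as well.
    The body is an assumption made for illustration only.
    """
    return line.split(",")
```

Under Python 3 only, the annotated form `def splitComma(line: str):` from the PR is valid as written.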

@jleetutorial
Owner

For all the programs which print to standard output, please set the logging level to ERROR so that there is less noise in the output.

from pyspark import SparkContext
from commons.Utils import Utils

def splitComma(line: str):

Again, it doesn't compile. I don't think you need the type annotation.

  File "/Users/cwei/code/python-spark-tutorial-new/rdd/airports/AirportsInUsaSolution.py", line 4
    def splitComma(line: str):
                       ^
SyntaxError: invalid syntax

from pyspark import SparkContext

if __name__ == "__main__":
    sc = SparkContext("local", "collect")

Please set the logging level to ERROR, as the Scala program does, to reduce noise in the output.

pedromb
Collaborator, Author

Hi James, some considerations about the logging level when using pyspark:

  • From within the script, PySpark only lets us set the log level after the SparkContext has started, so anything logged while the SparkContext is starting is printed regardless.
  • The best way to reduce the noise is to configure the log4j.properties file in the spark/conf folder.

That being said, I will set the log level to ERROR after the SparkContext starts.
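A minimal sketch of that approach, assuming the `SparkContext.setLogLevel` method from PySpark. The helper names `set_error_log_level` and `create_quiet_context` are hypothetical and not part of this PR; they only illustrate the ordering described above.

```python
def set_error_log_level(sc):
    """Lower the log level to ERROR on an already started context.

    Works on any object that exposes setLogLevel, such as a SparkContext.
    """
    sc.setLogLevel("ERROR")
    return sc


def create_quiet_context(app_name, master="local"):
    """Start a SparkContext, then quiet it (hypothetical helper)."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(master, app_name)
    # Startup messages have already been printed by this point;
    # only later log output is filtered.
    return set_error_log_level(sc)
```

In a script this could be used as `sc = create_quiet_context("collect")` in place of the direct `SparkContext("local", "collect")` call.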

@@ -0,0 +1,11 @@
from pyspark import SparkContext

if __name__ == "__main__":

Again, set the logging level to ERROR.

@@ -0,0 +1,17 @@
from pyspark import SparkContext

def isNotHeader(line:str):

Doesn't compile:

    def isNotHeader(line:str):
                        ^
SyntaxError: invalid syntax

@@ -0,0 +1,8 @@
from pyspark import SparkContext

if __name__ == "__main__":

Set the logging level to ERROR.

jleetutorial merged commit 9d9066c into master on Sep 29, 2017
Reviewers

@jleetutorial approved these changes

2 participants
@pedromb, @jleetutorial
